Workshop Attendees: - PowerPoint PPT Presentation

About This Presentation
Title:

Workshop Attendees:

Description:

Title: Slide 1 Author: HSLS Last modified by: YiBu Chen Created Date: 3/2/2005 2:27:53 PM Document presentation format: On-screen Show Company: University of Pittsburgh – PowerPoint PPT presentation


Transcript and Presenter's Notes

Title: Workshop Attendees:


1
Online Resources for Genetic Variation Study
Part One
Workshop Attendees: Please complete the workshop
sign-in form. To help us develop bioinformatics
workshops that are more relevant to your
research, please take our online User Needs
Survey. Thanks!
NML Bioinformatics Service User Needs Survey:
From the NML-Bioinformatics Web page → click
"NML Support Requests" under the Support
Request section → click "this online user
needs survey" under the "Tell us how to serve
your information needs better!" section.
2
Online Resources for Genetic Variation Study
Part One
Yi-Bu Chen, Ph.D. Bioinformatics Specialist
Norris Medical Library University of Southern
California 323-442-3309
yibuchen_at_belen.hsc.usc.edu
Dec. 6, 2007
3
Workshop Outline
  • Overview of Bioinformatics Support Program at NML
  • Human Genetic Variation Overview
  • Main types of genetic variations
  • Basics of the single nucleotide polymorphisms
    (SNPs)
  • NCBI Genetic Variation Resources dbSNP and OMIM
  • dbSNP overview
  • dbSNP search examples
  • OMIM overview
  • International HapMap Project
  • The HapMap project overview and major findings
  • HapMap search examples
  • The Perlegen Genetic Variation Database
  • Genome Variation Server (SeattleSNPs)
  • Ensembl SNPs
  • Hands-on Search Question

4
Polymorphisms: How different are we?
Human vs. chimp: 96% similar overall (99% similar in
terms of SNPs)
Human vs. human: 99.9% similar, with around 3.2
million single nucleotide differences (SNPs account
for up to 90% of all genomic variation; total
possible SNPs near 12 million)
Adapted from a lecture slide by Jonathan Wren, NYU
5
Why do we care about genetic variations?
1. Genetic variations underlie phenotypic
differences among different individuals
2. Genetic variations determine our
predisposition to complex diseases and responses
to drugs and environmental factors
3. Genetic variations reveal clues of ancestral
human migration history
6
Main Types of Genetic Variations
  • A. Single nucleotide mutation
  • Resulting in single nucleotide polymorphisms
    (SNPs)
  • Accounts for up to 90% of human genetic
    variations
  • Majority of SNPs do NOT directly or significantly
    contribute to any phenotypes
  • B. Insertion or deletion of one or more
    nucleotide(s)
  • 1. Tandem repeat polymorphisms
  • Tandem repeats are genomic regions consisting of
    variable length of sequence motifs repeating in
    tandem with variable copy number.
  • Used as genetic markers for DNA finger printing
    (forensic, parentage testing)
  • Many cause genetic diseases
  • Microsatellites (Short Tandem Repeats): repeat
    unit 1-6 bases long
  • Minisatellites: repeat unit 11-100 bases long
  • 2. Insertion/Deletion (INDEL or DIPS)
    polymorphisms
  • Often result from localized rearrangements
    between homologous tandem repeats.
  • C. Gross chromosomal aberration
  • Deletions, inversions, or translocation of large
    DNA fragments
  • Rare but often causing serious genetic diseases

7
How many variations are present in the human genome?
  • SNPs appear once per 0.1-1 kb interval, or on
    average 1 per 300 bp. Considering the size of
    the entire human genome (3.2 × 10^9 bp), the total
    number of SNPs is on the order of 11 million. Their
    high density and relatively easy assay make
    SNPs ideal genomic markers.
  • In silico estimates of potentially polymorphic
    variable number tandem repeats (VNTRs) exceed
    100,000 across the human genome.
  • Short insertions/deletions are very difficult
    to quantify, and their number is likely to fall
    between those of SNPs and VNTRs.
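A quick back-of-the-envelope check of the density arithmetic on this slide, as a minimal sketch. The function name is made up for illustration; the genome size and spacings are the ones quoted above. The average spacing gives a figure close to the roughly 11 million SNPs cited.

```python
GENOME_SIZE_BP = 3.2e9  # human genome size quoted on the slide


def estimated_snp_count(bp_per_snp, genome_size_bp=GENOME_SIZE_BP):
    """Expected number of SNPs for a given average spacing in base pairs."""
    return genome_size_bp / bp_per_snp


average = estimated_snp_count(300)   # 1 SNP per 300 bp on average
sparse = estimated_snp_count(1000)   # 1 SNP per 1 kb (sparse end of the range)
dense = estimated_snp_count(100)     # 1 SNP per 0.1 kb (dense end of the range)

print(f"average spacing: {average / 1e6:.1f} million SNPs")
print(f"range: {sparse / 1e6:.1f} to {dense / 1e6:.1f} million SNPs")
```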

8
Types of Single Base Substitutions
  • Transitions
  • Change of one purine (A,G) for another purine,
    or a pyrimidine (C,T) for another pyrimidine
  • Transversions
  • Change of a purine (A,G) for a pyrimidine (C,T),
    or vice versa.
  • The cytosine to thymine (C>T) transition accounts
    for approximately 2 out of every 3 SNPs in the human
    genome.
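The transition/transversion rule above can be sketched in a few lines. This is an illustrative helper, not part of any of the databases discussed; the function name is an assumption.

```python
PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def substitution_type(ref, alt):
    """Classify a single-base substitution as a transition or a transversion."""
    ref, alt = ref.upper(), alt.upper()
    bases = PURINES | PYRIMIDINES
    if ref not in bases or alt not in bases:
        raise ValueError("alleles must be one of A, C, G, T")
    if ref == alt:
        raise ValueError("reference and alternate alleles are identical")
    # Same chemical class (purine<->purine or pyrimidine<->pyrimidine) is a transition.
    same_class = (ref in PURINES) == (alt in PURINES)
    return "transition" if same_class else "transversion"


print(substitution_type("C", "T"))
print(substitution_type("A", "T"))
```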

9
SNP or Mutation?
  • Call it a SNP IF
  • the single base change occurs in a population at
    a frequency of 1% or higher.
  • Call it a mutation IF
  • the single base change occurs in less than 1% of
    a population.
  • A SNP is a polymorphic position where the point
    mutation has been fixed in the population.
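The 1% convention can be written as a one-line rule. The sketch below assumes allele counts from a sample of chromosomes; the function name and the example counts are hypothetical.

```python
def classify_variant(variant_allele_count, total_chromosomes, threshold=0.01):
    """Label a single-base change as 'SNP' or 'mutation' using the 1% rule."""
    if total_chromosomes <= 0:
        raise ValueError("total_chromosomes must be positive")
    frequency = variant_allele_count / total_chromosomes
    return "SNP" if frequency >= threshold else "mutation"


# Hypothetical counts: 5 of 200 chromosomes (2.5%) vs. 1 of 200 (0.5%).
print(classify_variant(5, 200))
print(classify_variant(1, 200))
```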

10
From a Mutation to a SNP
11
SNPs Classification
  • SNPs can occur anywhere in a genome; they are
    classified based on their locations.
  • Intergenic region
  • Gene region
  • can be further classified by sub-region (promoter,
    intronic, exonic, UTR, etc.)

12
Coding Region SNPs
  • Synonymous
  • Non-Synonymous
  • Missense: changes one amino acid to another.
  • Nonsense: changes an amino acid codon to a stop codon.

Geospiza Green Arrow tutorial by Sandra Porter,
Ph.D.
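The coding-SNP categories above can be illustrated with the standard genetic code. This is a minimal sketch: the function name is an assumption, and the "stop-loss" label for a stop codon changing to an amino acid goes beyond the three categories on the slide.

```python
from itertools import product

# Standard genetic code, with codons enumerated in TCAG order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def coding_snp_effect(ref_codon, alt_codon):
    """Classify a single-base change within one codon."""
    ref_codon, alt_codon = ref_codon.upper(), alt_codon.upper()
    if len(ref_codon) != 3 or len(alt_codon) != 3:
        raise ValueError("codons must be three bases long")
    if sum(a != b for a, b in zip(ref_codon, alt_codon)) != 1:
        raise ValueError("codons must differ at exactly one position")
    ref_aa, alt_aa = CODON_TABLE[ref_codon], CODON_TABLE[alt_codon]
    if ref_aa == alt_aa:
        return "synonymous"
    if alt_aa == "*":
        return "nonsense"
    if ref_aa == "*":
        return "stop-loss"  # not one of the slide's categories
    return "missense"


print(coding_snp_effect("GAA", "GAG"))  # Glu -> Glu
print(coding_snp_effect("GAG", "GTG"))  # Glu -> Val
print(coding_snp_effect("TGG", "TGA"))  # Trp -> stop
```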
13
The Consequences of SNPs
  • The phenotypic consequence of a SNP is
    significantly affected by the location where it
    occurs, as well as the nature of the mutation.
  • No consequence
  • Affect gene transcription quantitatively or
    qualitatively.
  • Affect gene translation quantitatively or
    qualitatively.
  • Change protein structure and functions.
  • Change gene regulation at different steps.

14
Simple/Complex Genetic Diseases and SNPs
  • Simple genetic diseases (Mendelian diseases) are
    often caused by mutations in a single gene.
  • -- e.g. Huntington's disease, cystic fibrosis, PKU, etc.
  • Many complex diseases are the result of mutations
    in multiple genes and of interactions among those
    genes and with environmental factors.
  • -- e.g. cancers, heart disease, Alzheimer's disease,
    diabetes, asthma, etc.
  • The majority of SNPs may not directly cause any
    disease.
  • SNPs are ideal genomic markers (dense and easy to
    assay) for locating disease loci in association
    studies.

15
(No Transcript)
16
Main Genetic Variation Resources
  • NCBI dbSNP
  • http://www.ncbi.nlm.nih.gov/SNP/index.html
  • NCBI Online Mendelian Inheritance in Man (OMIM)
  • http://www.ncbi.nlm.nih.gov/sites/entrez?db=OMIM
  • International HapMap Project
  • http://www.hapmap.org/
  • Perlegen
  • http://genome.perlegen.com
  • Genome Variation Server (Seattle SNPs)
  • http://gvs.gs.washington.edu/GVS/

17
Where to Find Bioinformatics Resources for
Genetic Variation Studies?
  • OBRC Online Bioinformatics Resources Collection
    (Univ. of Pittsburgh)
  • http://www.hsls.pitt.edu/guides/genetics/obrc
  • The most comprehensive annotated bioinformatics
    databases and software tools collection on the
    Web, with over 200 resources relevant to genetic
    variation studies.
  • HUGO Mutation Database Initiative
    http://www.hgvs.org/dblist/dblist.html

18
NCBI dbSNP Database Overview
  • URL: http://www.ncbi.nlm.nih.gov/SNP/index.html
  • NCBI's Single Nucleotide Polymorphism
    database (dbSNP) is the largest and primary
    public-domain archive for simple genetic
    variation data.
  • The polymorphism data in dbSNP include:
  • Single-base nucleotide substitutions (SNPs)
  • Small-scale multi-base deletions or insertions
    variations (also called deletion insertion
    polymorphisms or DIPs or INDELs)
  • Microsatellite tandem repeat variations (also
    called short tandem repeats or STRs).

19
dbSNP Data Stats (build 128, Oct. 2007)
  • http://www.ncbi.nlm.nih.gov/SNP/snp_summary.cgi

20
dbSNP Data Types
  • dbSNP contains two classes of records:
  • Submitted record
  • The original observations of sequence variation.
    Submitted SNP (SS) record IDs start with "ss"
    (e.g. ss5586300).
  • Computationally annotated record
  • Generated during the dbSNP "build" cycle by
    computation based on the original submitted data.
    Reference SNP cluster (RefSNP) IDs start with "rs"
    (e.g. rs4986582).

21
dbSNP Submitted Record
  • Provides information on the SNP and conditions
    under which it was collected.
  • Provides links to collection methods (assay
    technique), submitter information (contact data,
    individual submitter), and variation data
    (frequencies, genotypes).

ss5586300
22
From Submitted Record to Reference SNP Cluster
SNP records are submitted by researchers.
Each SNP position is mapped to the reference genomic
contigs.
If the SNP position is not unique, the record is
assigned to the existing RefSNP cluster.
If the SNP position is unique, a new RS is
assigned.
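The clustering rule on this slide can be sketched as a lookup keyed by mapped position. This is a deliberately simplified illustration, not dbSNP's actual build pipeline; the record type, function name, contig names, and all ss/rs numbers below are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Submission:
    ss_id: str      # submitted SNP accession (hypothetical values below)
    contig: str     # reference contig the SNP was mapped to
    position: int   # mapped position on that contig


def assign_refsnp_clusters(submissions, next_rs_number=1):
    """Map each ss record to an rs cluster based on its mapped position."""
    cluster_by_position = {}
    assignment = {}
    for sub in submissions:
        key = (sub.contig, sub.position)
        if key not in cluster_by_position:
            # Position not seen before: open a new RefSNP cluster.
            cluster_by_position[key] = f"rs{next_rs_number}"
            next_rs_number += 1
        # Otherwise the record joins the existing cluster at that position.
        assignment[sub.ss_id] = cluster_by_position[key]
    return assignment


records = [
    Submission("ss101", "contigA", 5000),
    Submission("ss102", "contigA", 5000),  # same position as ss101
    Submission("ss103", "contigA", 7200),
]
print(assign_refsnp_clusters(records))
```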
23
Different Ways to Search SNPs in dbSNP
  • dbSNP Web site
  • http://www.ncbi.nlm.nih.gov/SNP/index.html
  • Direct search of SS records; batch search; allows
    SNP record submission; NO search limits
  • Entrez SNP http://www.ncbi.nlm.nih.gov/sites/entrez?db=Snp
  • Search limit options allow precise retrieval
  • Entrez Gene Records: SNP Links Out Feature
  • Direct links to corresponding SNP records;
    access to genotype and linkage disequilibrium
    data
  • NCBI's Map Viewer
  • Visualize SNPs in the genomic context along with
    other types of genetic data.

24
Search SNPs from dbSNP Web Page
  • dbSNP Web site
  • http://www.ncbi.nlm.nih.gov/SNP/index.html

25
Search SNPs from Entrez SNP Web Page
  • Entrez SNP http://www.ncbi.nlm.nih.gov/sites/entrez?db=Snp
  • dbSNP is part of the Entrez integrated
    information retrieval system and may be searched
    using either qualifiers (aliases) or a
    combination of search limits from 14 different
    categories.

26
Entrez SNP Search Limits
  • Organisms
  • Chromosome (including W and Z for non-mammals)
  • Chromosome Ranges
  • Map Weight (how many times in genome)
  • Function Class (coding non-synonymous, intron,
    etc.)
  • SNP Class (types of variations)
  • Method Class (methods for determining the
    variations)
  • Validation Status (if and how the data is
    validated)
  • Variation Alleles (using IUPAC codes)
  • Annotation (records with links to other NCBI
    databases)
  • Heterozygosity (% of heterozygous genotypes)
  • Success Rate (likelihood that the SNP is real)
  • Created Build ID
  • Updated Build ID

http://www.ncbi.nlm.nih.gov/portal/query.fcgi?db=Snp
http://www.ensembl.org/common/helpview?kwsnpview
ref
27
Search dbSNP Example 1
Some mutations on human BRCA1 gene have been
reported to be involved in the early onset of
breast cancer. Retrieve all validated
non-synonymous coding reference SNPs for BRCA1
from dbSNP.
Hint: start from Entrez SNP
http://www.ncbi.nlm.nih.gov/sites/entrez?db=Snp
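The same kind of limited search can also be expressed as a query string for NCBI's E-utilities ESearch service. The sketch below only builds the URL and sends no request. The field tags ("Gene Name", "Organism", "Function Class", "Validation Status") and the limit values are assumed spellings for illustration; check the Entrez SNP limits page for the exact terms before using them.

```python
from urllib.parse import parse_qs, urlencode, urlsplit

ESEARCH_URL = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi"


def build_snp_search_url(gene, limits, retmax=20):
    """Build an ESearch URL for the SNP database (no network call is made)."""
    terms = [f"{gene}[Gene Name]"]
    terms += [f'"{value}"[{field}]' for field, value in limits.items()]
    params = {"db": "snp", "term": " AND ".join(terms), "retmax": retmax}
    return f"{ESEARCH_URL}?{urlencode(params)}"


# Assumed, illustrative field tags and values; verify against Entrez SNP.
example_limits = {
    "Organism": "Homo sapiens",
    "Function Class": "coding nonsynonymous",
    "Validation Status": "validated",
}
url = build_snp_search_url("BRCA1", example_limits)
decoded_term = parse_qs(urlsplit(url).query)["term"][0]
print(decoded_term)
```

Combining limits with AND mirrors ticking several boxes on the Limits page: each extra clause narrows the result set.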
28
Entrez SNP Search Results Example 1
29
dbSNP Ref SNP Record Example 1: Summary
http://www.ncbi.nlm.nih.gov/SNP/snp_ref.cgi?rs=4986852
This Ref SNP cluster contains multiple submitted
SNP records from different groups
30
dbSNP Ref SNP Record Example 1 SNP position and
the flank region
31
dbSNP Ref SNP Record Example 1 GeneView of an
individual SNP
Because of alternative splicing, the same
SNP can be located in different regions of
different transcripts.
32
dbSNP Ref SNP Record Example 1 TableView of an
individual SNP
Notice that the individual SNP is mapped to the
same position on the reference genomic contig,
but different positions on mRNAs and proteins due
to alternative splicing.
33
dbSNP Ref SNP Record Example 1 Links to Various
Annotated NCBI Databases
Link to the OMIM record where documented clinical
and genetic data of this SNP can be found.
Warning: the lack of an OMIM link does not necessarily
mean that this SNP is unrelated to any OMIM
record.
34
dbSNP Ref SNP Record Example 1 Population
Allele Frequency, Genotype and Heterozygosity Data
Link to the detailed population genotype data.
Data from National Cancer Institute.
Data from The NIH Polymorphism Discovery Resource
Data from Centre d'Etude du Polymorphisme Humain
(CEPH).
Data from the International HapMap Project.
35
dbSNP Ref SNP Record Example 1: GeneView and
SequenceView of ALL SNPs
36
dbSNP Ref SNP Record Example 1 Links to View
SNPs on 3D Structure, Conserved Domains, and
Multiple Sequence Alignment
37
Search dbSNP Example 2
Mutations in Dopamine Receptor 5 (DRD5) gene have
been observed in patients with various
neurological disorders. Find how many refSNP
records have been reported for DRD5. Show all
refSNPs in the context of a chromosome.
Hint: start from Entrez Gene
http://www.ncbi.nlm.nih.gov/sites/entrez?db=gene
38
Search dbSNP SNP Links from Entrez Gene Record
39
Search dbSNP SNP Display Using NCBI Map Viewer
40
Search dbSNP Configure Map Viewer to Display
other Relevant Data
41
SNPs Display in Map Viewer Legend
Click on any column headings to see the refSNPs
legend.
http://www.ncbi.nlm.nih.gov/SNP/get_html.cgi?whichHtml=verbose
42
SNPs Display in Map Viewer Legend
43
Online Mendelian Inheritance in Man (OMIM) A
Brief Overview
  • URL: http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?db=OMIM
  • OMIM is a human genetic disorders database built
    and curated using results from published studies.
  • Each OMIM record provides a summary of the
    current state of knowledge of the genetic basis
    of a disorder, which contains the following
    information
  • description and clinical features of a disorder
    or a gene involved in genetic disorders
  • biochemical and other features
  • cytogenetics and mapping
  • molecular and population genetics
  • diagnosis and clinical management
  • animal models for the disorder
  • allelic variants.
  • OMIM is searchable via NCBI Entrez, and its
    records are cross-linked to other NCBI resources.

44
Online Mendelian Inheritance in Man Stats
  • http://www.ncbi.nlm.nih.gov/Omim/mimstats.html

45
OMIM Allelic Variants
  • The OMIM database includes genetic disorders
    caused by various mutations/variations, from SNPs
    to large-scale chromosomal abnormalities.
  • The listed allelic variants are searchable
    through the "Allelic Variants" field.
  • Single nucleotide substitutions (SNPs)
  • small insertions and deletions (INDEL/DIPS)
  • frame shifts caused by these INDELs.
  • Allelic variants are represented by a 10-digit
    OMIM number, and can be searched in two ways
  • Search for a gene or a disease, when retrieved,
    view its allelic variants.
  • Use the Limits to narrow your search to
  • -- retrieve only records that contain allelic
    variant information
  • -- search for particular terms within the
    allelic variants field.

46
Notes on OMIM Allelic Variants
  • For most genes, only selected mutations are
    included
  • Criteria for inclusion include the first
    mutation to be discovered, high population
    frequency, distinctive phenotype, historic
    significance, unusual mechanism of mutation,
    unusual pathogenetic mechanism, and distinctive
    inheritance.
  • Most of the allelic variants represent
    disease-producing mutations, NOT polymorphisms.
  • A few polymorphisms are included, many of which
    show a positive statistical correlation with
    particular common disorders.
  • Few neutral polymorphisms are included in OMIM.
  • Some SNPs in the dbSNP records are not linked to
    the corresponding OMIM records.

http://www.ncbi.nlm.nih.gov/entrez/dispomim.cgi?id=113705
47
Sequence variations view in UniProt Beta
http://beta.uniprot.org/uniprot/P38398
48
Assessing Polymorphisms: Genotypes and Genotyping
  • Genotype: Each person has two copies of all
    chromosomes except the sex chromosomes. The set
    of alleles at a given locus forms the genotype.
  • Genotyping: the process of identifying what
    genotype a person has at any given locus (or loci).
  • Whole-genome genotyping of all SNPs in a human
    genome? (11.8 million and counting)
  • Technologically daunting
  • Prohibitively expensive and time consuming

49
Assessing Polymorphisms: the Origin of Haplotype
  • Two ancestral chromosomes scrambled through
    recombination over many generations to yield
    different descendant chromosomes.
  • If a genetic variant marked by the X on the
    ancestral chromosome increases the risk of a
    particular disease, the two descendants who
    inherit that part of the ancestral chromosome
    will be at increased risk.
  • Adjacent to the variant marked by the X are many
    SNPs that can be used to identify the location of
    the variant.
  • Haplotype: A particular combination of alleles
    along a chromosome that tends to be inherited as
    a unit.

http://www.hapmap.org/originhaplotype.html
50
Assessing Polymorphisms: Linkage Disequilibrium,
Haplotype Block, and Tag SNPs
Adapted from Nature 426 (6968): 789-796 (2003)
  • Linkage Disequilibrium (LD): If two alleles tend
    to be inherited together more often than would be
    predicted by chance, then the alleles are in linkage
    disequilibrium.
  • If most SNPs are highly significantly correlated
    with one or more neighbors, these correlations
    can be used to generate haplotypes, which
    are excellent proxies for individual SNPs.
  • Because haplotypes may be identified by a much
    smaller number of SNPs (tag SNPs), assessing
    polymorphisms via haplotypes dramatically reduces
    genotyping work.
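"More often than would be predicted" can be made concrete with the standard pairwise LD measures D, D' and r², computed from phased two-locus haplotypes. The sketch below uses the textbook definitions; the function name and the toy haplotype lists are made up for illustration and are not HapMap data.

```python
def linkage_disequilibrium(haplotypes):
    """Return (D, D', r^2) for two biallelic loci.

    `haplotypes` is a list of two-character strings: the allele at
    locus 1 followed by the allele at locus 2 on one chromosome.
    """
    n = len(haplotypes)
    a, b = haplotypes[0][0], haplotypes[0][1]  # alleles used as reference
    p_a = sum(h[0] == a for h in haplotypes) / n
    p_b = sum(h[1] == b for h in haplotypes) / n
    p_ab = sum(h == a + b for h in haplotypes) / n

    # D: observed haplotype frequency minus the expectation under independence.
    d = p_ab - p_a * p_b
    variance_term = p_a * (1 - p_a) * p_b * (1 - p_b)
    if variance_term == 0:
        raise ValueError("both loci must be polymorphic in the sample")
    if d >= 0:
        d_max = min(p_a * (1 - p_b), (1 - p_a) * p_b)
    else:
        d_max = min(p_a * p_b, (1 - p_a) * (1 - p_b))
    return d, abs(d) / d_max, d * d / variance_term


perfect = ["AG"] * 5 + ["CT"] * 5          # alleles always travel together
independent = ["AG", "AT", "CG", "CT"]     # every combination equally common
print(linkage_disequilibrium(perfect))
print(linkage_disequilibrium(independent))
```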

51
Assessing Polymorphisms: Tag SNPs
  • Tag SNP: a representative SNP from which other
    SNPs in its neighborhood can be inferred (or
    predicted), in terms of both distance and genealogy.
  • An r² of 0.8 or greater is sufficient for tag SNP
    mapping to obtain good coverage of untyped
    SNPs.
  • Tag SNPs allow a smaller number of marker SNPs to
    be genotyped with very small losses in power.
  • If LD between SNPs is low, almost every SNP might
    have to be genotyped to capture all variation
    information.
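One simple way to pick tag SNPs from pairwise r² values is a greedy cover: repeatedly choose the SNP that captures the most still-uncaptured SNPs at the chosen threshold. This is a generic sketch of that idea, not the algorithm used by the HapMap tag SNP picker or Haploview; the function name, rs IDs and r² values are invented.

```python
def pick_tag_snps(snps, r2, threshold=0.8):
    """Greedily choose tag SNPs so every SNP is tagged at r^2 >= threshold.

    `r2` maps frozenset({snp_a, snp_b}) to the pairwise r^2; missing
    pairs are treated as r^2 = 0.
    """
    def captured_by(tag):
        return {
            s for s in snps
            if s == tag or r2.get(frozenset((tag, s)), 0.0) >= threshold
        }

    uncovered = set(snps)
    tags = []
    while uncovered:
        # Pick the SNP that captures the most SNPs not yet covered.
        best = max(snps, key=lambda s: len(captured_by(s) & uncovered))
        tags.append(best)
        uncovered -= captured_by(best)
    return tags


# Hypothetical SNP IDs and r^2 values, for illustration only.
snps = ["rs1", "rs2", "rs3", "rs4"]
pairwise_r2 = {
    frozenset(("rs1", "rs2")): 0.95,
    frozenset(("rs2", "rs3")): 0.85,
    frozenset(("rs3", "rs4")): 0.30,
}
print(pick_tag_snps(snps, pairwise_r2))
```

In this toy example two tags stand in for four SNPs; with low LD (no pair reaching the threshold) every SNP becomes its own tag.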

52
International HapMap Project
  • Goals
  • Create a public genome-wide database of common
    human genetic variation in the context of
    geographic distribution
  • Provide such information to guide genetic studies
    of clinical phenotypes
  • Phase I (Oct. 2002)
  • One million common SNPs (one every 5 kb across the
    genome) were genotyped in 269 DNA samples from
    four populations.
  • Common SNPs: Minor Allele Frequency (MAF) ≥ 0.05
  • YRI: Yoruba in Nigeria (30 trios); CEU: Utah
    residents with European ancestry (30 trios); CHB: 45 Han
    Chinese; JPT: 44 Japanese
  • Phase II
  • An additional 4.6 million SNPs are genotyped.
  • ENCODE (Encyclopedia of DNA Elements)
  • Collection of ten regions, each 500kb in length.
  • Each 500 kb region was re-sequenced and all SNPs
    were genotyped.

53
HapMap Progress
  • PHASE I completed
  • 1,000,000 SNPs successfully typed in all 269
    HapMap samples
  • At least one common SNP every 5 kb across the
    genome
  • ENCODE variation reference resource available
  • PHASE II data generation complete, about 4.6
    million SNPs typed in total.
  • ENCODE-HAPMAP A much more detailed variation
    resource
  • 48 samples sequenced
  • All discovered SNPs (and any others in dbSNP)
    typed in all 270 HapMap samples
  • Current data set average 1 SNP every 279 bp

54
HapMap Data Overview
Basic data: genotypes of the 270 individual
samples (frequencies of SNP alleles and genotypes
in each population)
Recent data release (Full Data Set): January 11, 2007,
NCBI B35 (includes both Phase I and II data, genotypes
from Illumina 100k and 300k genotyping arrays, and the
Affymetrix nsSNPs)
Phase I: 600,000 common SNPs in 270 individuals
Phase II: 4-5 million SNPs in the same individuals
  • Available for bulk download
  • All genotype data, haplotype phasing data (from
    PHASE)
  • Pedigree trio files
  • Raw LD data (D', r²), recombination rates and
    hotspots
  • Allele and genotype frequencies
  • SNP assays and protocols
  • Allocated SNPs (dbSNP reference clusters chosen
    for genotyping)

Adapted from Alanna Morrison, Human Genetics
Center, Feb. 2007 lecture
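Allele and genotype frequencies of the kind distributed with the HapMap data can be derived directly from per-sample genotype calls. A minimal sketch follows; the function name and the four example genotypes are made up and do not come from any HapMap file.

```python
from collections import Counter


def genotype_and_allele_frequencies(genotypes):
    """Compute genotype and allele frequencies from diploid genotype calls.

    `genotypes` is a list of two-letter strings such as "AT"; allele
    order within a genotype is ignored ("AT" and "TA" are the same).
    """
    normalized = ["".join(sorted(g.upper())) for g in genotypes]
    n = len(normalized)
    genotype_freq = {g: c / n for g, c in Counter(normalized).items()}
    allele_counts = Counter(allele for g in normalized for allele in g)
    # Each individual carries two alleles, hence 2 * n chromosomes.
    allele_freq = {a: c / (2 * n) for a, c in allele_counts.items()}
    return genotype_freq, allele_freq


sample_calls = ["AA", "AT", "TA", "TT"]  # hypothetical genotype calls
genotype_freq, allele_freq = genotype_and_allele_frequencies(sample_calls)
print(genotype_freq)
print(allele_freq)
```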
55
Major Findings of the HapMap Project
  • Extensive redundancy of SNPs: over 90% of all SNPs
    on the map have highly statistically significant
    correlation to one or more neighbors.
  • Confirmed the generality of recombination
    hotspots and long segments of strong LD
    (haplotype blocks), with the average length
    ranging from 7.3 kb (YRI) to 16.3 kb (CEU), and
    65-85% of the human genome contained in such
    blocks.
  • Revealed limited haplotype diversity: while each
    haplotype block contains 30-70 SNPs, on average
    only 4-5.6 common haplotypes exist per block, which
    can be identified by a smaller number of
    SNPs (tag SNPs).
  • The density of common SNPs can be reduced by
    75-90% with essentially no loss of information.
    That is, the genotyping burden can be reduced
    from one common SNP every 500 bp to one SNP every
    2 kb (YRI) to 5 kb (CEU and CHB/JPT).
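The reduction quoted in the last bullet follows from the change in SNP spacing: going from one SNP per 500 bp to one per 2 kb or 5 kb removes 75% or 90% of the markers. A small sketch of that arithmetic, with a made-up function name:

```python
def density_reduction(original_spacing_bp, reduced_spacing_bp):
    """Fraction of SNPs dropped when the spacing between typed SNPs widens."""
    return 1 - original_spacing_bp / reduced_spacing_bp


yri = density_reduction(500, 2000)   # one SNP every 2 kb
ceu = density_reduction(500, 5000)   # one SNP every 5 kb
print(f"YRI: {yri:.0%} fewer SNPs, CEU and CHB/JPT: {ceu:.0%} fewer SNPs")
```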

56
What can you do from the HapMap Web Site?
  • Search for SNPs in a gene or any region of
    interest (ROI).
  • View patterns of LD in the ROI.
  • Select tagSNPs in the ROI.
  • Download information on the SNPs in ROI for
    genotype/haplotype data analysis and
    visualization in Haploview or other software.
  • Generate and retrieve customized subset data.
  • Download the entire data set in bulk.

57
Search HapMap Example 1
SNPs in human BRCA1 gene have been reported to be
involved in the early onset of breast cancer.
Find all available genotype and LD data for SNPs
documented for BRCA1 in HapMap database.
http://www.hapmap.org/
58
HapMap Search Example 1, Step 1: Open the Genome
Browser with the Latest Full Data Set
Click HapMap Genome Browser (B35 full data set)
59
HapMap Search Example 1, Step 2: Specify the
landmark/region of interest
Enter gene name brca1 to specify the region of
your interest
When there are multiple transcripts, click one of
your choice
60
HapMap Search Example 1, Step 3: Examine and
determine the desired region for display
The mRNA
Examine the region for display using different
scales
Genotype frequency
Genotyped SNPs in the region, pie chart shows
allelic frequencies (ref vs other)
61
HapMap Search Example 1, Step 4: Display genotype
data for each refSNP
62
HapMap Search Example 1, Step 5: Select the
desired tracks for display
Select the desired analysis results for display
Click Update Image once the configuration is
done
63
HapMap Search Example 1, Step 6: Configure the tag
SNP Picker
Select the desired population
Select the desired tagging methods
Select r2 value to set desired stringency
Set MAF for the lowest threshold of alleles to be
captured by the tagged SNPs
Specify SNPs to be included/excluded as tagged
SNPs
64
HapMap Search Example 1, Step 7: Configure the LD
Plot
Configure LD plot display
Select LD measurement and range
Customize the color display for LD value
Select desired populations
65
HapMap Search Example 1, Step 8: Tag SNPs and LD
Plot
Genotyped SNPs in the region
LD plot shows LD between different pairs of SNPs
Tagged SNPs based on your criteria
66
HapMap Search Example 1, Step 9: Download various
data and files
Click Go
The genotype data can be used for in depth LD and
Haplotype analysis with the free Haploview
program.
Select desired data or file for download
67
Haploview: http://www.broad.mit.edu/mpg/haploview/
68
Haploview Screenshots
69
HapMap Data Extraction using HapMart
Select desired population
www.hapmap.org
70
HapMap Data Extraction using HapMart: Data filter
and export
71
Perlegen Sciences
  • Founded in 2000 with the mission of identifying
    clinically relevant patterns of genetic
    variation.
  • Over 1.6 million common SNPs genotyped in 71
    individuals from 3 American populations of
    European, African and Asian ancestry (about 1
    SNP/1871 bp)
  • GWA studies on over 100,000 different human
    individuals.
  • Re-sequenced the nuclear DNA genomes of 15 inbred
    laboratory mouse strains and generated genotype
    data.
  • Specialized Mouse Genome Browser allows users to
    visualize the SNPs and LR-PCR primer pairs and
    access the SNP genotypes for the 15 strains
  • http://mouse.perlegen.com/mouse/browser.html

72
Perlegen Human Genotype Browser
http://genome.perlegen.com/cgi-bin/gbrowse/
73
Perlegen Human Genotype Browser
74
Genome Variation Server (SeattleSNPs)
  • Hosts raw genotyping data for 4.5 million human
    SNPs from HapMap, Perlegen, and other projects.
  • Generated SNP data on candidate genes involved
    in cardiovascular diseases and inflammatory
    processes.
  • Tools for searching, visualization and analysis
    of genotype data for association studies.
  • Merging of SNP data sets from different populations.

75
Using Genome Variation Server
http://gvs.gs.washington.edu/GVS/index.jsp
Select the search type to start the search
Upload your genotype data for analysis
Detailed online tutorial
76
GVS Search Example rs9939609 (FTO gene)
  • Step 1: Select query type

77
GVS Search Example rs9939609 (FTO gene)
  • Step 2: Select population(s)

78
GVS Search Example rs9939609 (FTO gene)
  • Step 3: Configure parameters

79
GVS Search Example rs9939609 (FTO gene)
  • Step 4: Display results: genotype data

80
GVS Search Example rs9939609 (FTO gene)
  • Step 4: Display results: genotype data

rs9939609
SNP ID
Sample
81
GVS Search Example rs9939609 (FTO gene)
  • Step 5: Display results: TagSNPs

TagSNPs Table Display
82
GVS Search Example rs9939609 (FTO gene)
  • Step 5: Display results: TagSNPs

Bin
TagSNPs Graphic Display
83
GVS Search Example rs9939609 (FTO gene)
  • Step 6: Display results: LD

84
GVS Search Example rs9939609 (FTO gene)
  • Step 7: Display results: Summary

85
SNPs in Ensembl
http://www.ensembl.org/index.html
  • Most SNPs imported from dbSNP (rs)
  • Imported data: alleles, flanking sequences,
    frequencies, etc.
  • Calculated data: position, synonymous status,
    peptide shift, etc.
  • For human also
  • HGVbase
  • TSC
  • Affy GeneChip 100K and 500K Mapping Array
  • Ensembl-called SNPs (from Celera reads)
  • For mouse and rat also
  • Sanger- and Ensembl-called SNPs

86
SNPs in Ensembl
MapView SNP density on chromosome
87
SNPs in Ensembl
ContigView SNPs in genomic context
88
SNPs in Ensembl
GeneSeqView SNPs in genomic sequence
89
SNPs in Ensembl
TransView ProtView SNPs in transcript/ protein
90
SNPs in Ensembl
What SNPs does my gene contain? > GeneSNPView
91
SNPs in Ensembl
  • Info about one specific SNP?
  • > SNPView
  • SNP Report
  • Genotype and allele frequencies per population
  • Located in transcripts
  • SNP Context
  • Individual genotypes

92
https://www.pharmgkb.org/index.jsp
93
(No Transcript)
94
User Question
A recent report (Frayling et al., Science 2007)
found that a common variant (rs9939609, A>T) in the
FTO gene (fat mass and obesity associated) is
associated with body mass index and predisposes
to obesity and diabetes. The 16% of adults
who were homozygous for the risk allele A weighed about
3 kg more and had 1.67-fold increased odds of obesity
compared to those without the risk allele. Use
HapMap and dbSNP to find the genotype data for
this SNP in different populations.
95
Answer 1 Searching HapMap
Use the refSNP ID (must start with rs) as the
landmark for the search
Click on the pie chart for detailed population
genotype data
96
Answer 1 Searching HapMap
Population genotype data of the homozygous risk
allele A
Retrieve detailed genotyping data
97
Answer 2: Searching NCBI's dbSNP
http://www.ncbi.nlm.nih.gov/sites/entrez?db=Snp
Click on the rs record for detailed SNP data
report
98
Answer 2: Searching NCBI's dbSNP
Genotype data from Perlegen's project with
different population samples
99
Acknowledgement
  • In addition to those already stated, some slides
    of this workshop were adapted from the sources
    below
  • Chattopadhyay A. and M.R. Tennant. Genetic
    Variation Resources. Lecture slides for 2007
    NCBI Advanced Workshop for Bioinformatics
    Information Specialists.
  • Stein L. Using HapMap.org A tutorial.
    Presentation slides as part of the Official
    HapMap Tutorial.
  • Overduin B. Sequence Variation in Ensembl.
    Lecture slides for Ensembl Courses and Workshops

100
Recommended Topics for the Second Part of Online
Resources for Genetic Variation Study
  • Functional analysis of SNPs
  • Tools for SNP discovery and genotyping
  • Tools for TagSNPs selection
  • Tools for genome wide association study
  • Genetic association databases
  • Others??

101
Please evaluate this workshop to help me
improve future presentations:
http://www.zoomerang.com/survey.zgi?p=WEB226GJV4RJWR
Have questions or comments about this workshop?
Please contact Yi-Bu Chen, Ph.D., Bioinformatics
Specialist, Norris Medical Library, University of
Southern California, 323-442-3309,
yibuchen_at_belen.hsc.usc.edu