People are different - PowerPoint PPT Presentation


Title: People are different


1
15.04.06
2
People are different
3
and so are their genomes
4
SNP (single nucleotide polymorphism): a single-nucleotide
variant in which the rarer variant (allele) has a frequency
above 1%
$N_a + N_g = N$, $N_a/N > 0.01$, $N_g/N > 0.01$
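The frequency criterion on this slide can be sketched in a few lines of Python. This is a minimal illustration, not part of the presentation: the function name `is_snp` is hypothetical, and the strict `>` comparison is an assumption, because the comparison sign did not survive in the transcript.

```python
def is_snp(count_a: int, count_g: int, threshold: float = 0.01) -> bool:
    """Return True if both alleles of a biallelic site pass the frequency threshold.

    count_a and count_g are the numbers of sampled chromosomes carrying each
    allele (Na and Ng on the slide); their sum is N.
    The strict ">" is an assumption, the slide's comparison sign was lost.
    """
    total = count_a + count_g  # N = Na + Ng
    if total == 0:
        raise ValueError("no chromosomes sampled")
    return count_a / total > threshold and count_g / total > threshold


# A site where the rarer allele is seen on 50 of 1000 chromosomes (5%)
common_site = is_snp(950, 50)
# A site where the rarer allele is seen on 1 of 1000 chromosomes (0.1%)
rare_site = is_snp(999, 1)
```

With these inputs `common_site` is true and `rare_site` is false, so only the first site would count as a SNP under the 1% convention.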
5
(No Transcript)

6
Types of polymorphism in the genome
  • SNP (single nucleotide polymorphism)
  • VNTR (variable number tandem repeat)
  • MNP
7
Some properties of SNPs
  • Account for 90% of human genetic variation
  • Occur with an average density of 1 per 600 bp
  • The transition C→T (G→A) accounts for 2/3 of all cases;
    the three transversions C→A (G→T), C→G (G→C), and T→A (A→T)
    account for 1/6 of all cases each
  • Most of them (85%) are common to all
    populations (with differing allele frequencies)
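The distinction between transitions and transversions in the list above can be made concrete with a short Python sketch. The function name `substitution_class` is hypothetical and chosen only for illustration; the rule itself is the standard one (purine to purine or pyrimidine to pyrimidine is a transition, anything else is a transversion).

```python
PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def substitution_class(ref: str, alt: str) -> str:
    """Classify a single-nucleotide substitution as a transition or a transversion."""
    ref, alt = ref.upper(), alt.upper()
    valid = PURINES | PYRIMIDINES
    if ref not in valid or alt not in valid or ref == alt:
        raise ValueError(f"not a valid substitution: {ref}>{alt}")
    # Both bases in the same chemical class means a transition
    same_class = {ref, alt} <= PURINES or {ref, alt} <= PYRIMIDINES
    return "transition" if same_class else "transversion"


# The substitutions named on the slide
classes = {pair: substitution_class(*pair)
           for pair in [("C", "T"), ("G", "A"), ("C", "A"), ("C", "G"), ("T", "A")]}
```

Here `classes` maps C→T and G→A to "transition" and the remaining three pairs to "transversion".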

8
Why are SNPs important?
  • Convenient genetic markers
  • Responsible for the existence of various phenotypes,
    with primary interest in disease phenotypes
  • Pharmacogenomics: individual response to drugs
  • Clues to understanding human evolution

9
SNPs in the human genome
10
Classification of SNPs by location in the genome
  1. Genes
    1.1 UTR
    1.2 Coding (cSNP)
      1.2.1 Synonymous (sSNP)
      1.2.2 Non-synonymous (nsSNP)
    1.3 Introns
    1.4 Splice sites
  2. Regulatory regions of genes (rSNP)
  3. Intergenic regions
11
Synonymous vs. non-synonymous SNPs
Example: Lysosomal alpha-glucosidase precursor
(SwissProt P10253)
Reference: CAC CAG CTC CTG TGG GGG GAG GCC CTG CT
Variant:   CAC CAG CTC CTG TGC GGG GAG GCT CTG CT
HGVBase ID SNP000003023, G → C: nsSNP, Trp746 → Cys
Hypothetical SNP, C → T: sSNP, Ala749 → Ala
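The Trp746/Ala749 example can be checked mechanically by translating both codon strings with the standard genetic code. The sketch below is an illustration only: the helper name `classify_codon_changes` is hypothetical, the starting codon number 742 is inferred from the slide's Trp746 label, and the trailing incomplete codon "CT" is left out.

```python
from itertools import product

# Standard genetic code in the conventional TCAG ordering
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(codon): aa
               for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)}


def classify_codon_changes(reference, variant, first_codon_number):
    """Compare two codon lists and label each difference as sSNP or nsSNP."""
    changes = []
    for offset, (ref, var) in enumerate(zip(reference, variant)):
        if ref == var:
            continue
        ref_aa, var_aa = CODON_TABLE[ref], CODON_TABLE[var]
        kind = "sSNP" if ref_aa == var_aa else "nsSNP"
        changes.append((first_codon_number + offset, ref_aa, var_aa, kind))
    return changes


reference = "CAC CAG CTC CTG TGG GGG GAG GCC CTG".split()
variant = "CAC CAG CTC CTG TGC GGG GAG GCT CTG".split()
# 742 is an inferred starting position: the fifth codon is labelled 746 on the slide
changes = classify_codon_changes(reference, variant, first_codon_number=742)
```

`changes` then holds two entries: position 746 with W replaced by C (non-synonymous) and position 749 with A unchanged (synonymous), matching the slide.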
12

Summary of annotation on human genome Build 33, dbSNP Build 124

Function class code   SNP count   Gene count   Functional classification
1                        338787        26210   Locus region
3                         39214        14342   Allele synonymous to contig nucleotide
4                         50772        15710   Allele nonsynonymous to contig nucleotide
5                        546965        17898   Untranslated region
6                       2925773        19332   Intron
7                           832          769   Splice site
8                         89554        18655   Allele is same as contig nucleotide
9                          7111         1006   Coding synonymy unknown
13
SNP life cycle (after Miller & Kwok, 2001)
  1. Appearance of a new allelic variant by mutation
  2. Survival of the allele through the early generations
  3. Increase of the allele frequency in the population
  4. Fixation of one allele (0 vs. 100%), which turns the
    polymorphism into a between-species difference

14
The life cycle of an SNP takes about 0.3 million years.
Humans and chimpanzees diverged 5 million years ago, and
H. sapiens left Africa and split into populations
0.1-0.2 million years ago. This bears on (a) SNPs shared
between humans and other species and (b) private SNPs, i.e.
SNPs present only within a particular population
15
Why are polymorphisms maintained in the
population?
  • Selectionists: because heterozygotes have higher
    fitness
  • Neutralists: because all observed polymorphisms
    are selectively neutral
- - - - - - - - - - - - - - - - - - - - - - - - -
  • Reality is always somewhat more complicated

16
Why are SNPs important?
  • Convenient genetic markers
  • Responsible for the existence of various phenotypes,
    with primary interest in disease phenotypes
  • Pharmacogenomics: individual response to drugs
  • Clues to understanding human evolution

17
nsSNPs vs. disease mutations
  • Disease mutations are rare (<<1%) and usually
    cause monogenic diseases (e.g., cystic fibrosis)
  • nsSNPs are frequent (>1%) and can modify the risk of
    major common (multigenic, complex) diseases
    (e.g., cancer, cardiovascular disease, mental
    illness, autoimmune states, diabetes)
  • In some cases, however, it is difficult to make a
    distinction

18
Some common nsSNPs are known to affect critical
structural features
The frequency of the haemochromatosis allelic variant
Cys260Tyr of the HLA-H protein (which destroys a
disulphide bond) is up to 6% in Northern Europe
19
Application area for prediction methods
  • Genetics of complex diseases
  • Analysis of human birth defects
  • Genetics of rare developmental phenotypes
    (analysis of de novo mutations that cannot be
    mapped by genetic techniques)
  • Genetics of model organisms (identification of
    genes involved in diverse processes by
    mutagenesis screens)
  • Genomics and evolutionary genetics (e.g.,
    quantifying selective pressure)

20
Identifying SNPs responsible for complex
diseases: general strategies
  • Whole-genome scan: a hypothesis-free approach with an
    extraordinary number of candidate SNPs
  • Candidate gene studies: require a priori
    models; nevertheless, large numbers of candidate
    SNPs must be tested

21
Identifying SNPs responsible for complex
diseases: application
  1. A SNP with an established association need not be
    functional; therefore, in silico expertise is
    required to select potentially functional SNPs
  2. Detection of enrichment of rare, potentially
    functional alleles in the disease population
    (plasma levels of HDL-cholesterol, hypertension,
    colorectal cancer)
22
Methods for predicting the effect of nsSNPs
  • Sequence-based methods: analysis of multiple
    alignment with homologs (Ng, Henikoff 2002)
  • Structure-based methods: analysis of various
    structural parameters (Wang, Moult 2001;
    Chasman, Adams 2001)
  • Combined methods: sequence and structure
    analysis (Sunyaev, Ramensky, Bork 2000, 2001, 2002)

23
PolyPhen: prediction of the effect of an amino acid
substitution on protein function
Prediction: benign (neutral) or damaging
(deleterious)
24
PolyPhen: prediction of the effect of an amino acid
substitution on protein function
Data sources:
  • Sequence annotation of the query protein
  • PSIC profile matrix values derived from multiple
    alignment with homologous proteins
  • Structural parameters and contacts of the query
    protein structure or of a homolog with >50%
    sequence identity

Prediction: benign (neutral) or damaging
(deleterious)
25
I. Sequence annotation
Hereditary hemochromatosis protein precursor
(HLA-H, Q30201)
Features checked:
  • bond: DISULFID, THIOLEST, THIOETH
  • site: BINDING, ACT_SITE, LIPID, METAL, SITE,
    MOD_RES, SE_CYS
  • region: TRANSMEM, SIGNAL, PROPEP
26
II. PSIC profile analysis of homologous sequences
  1. Align with homologous proteins whose sequence
    identity is in the range 30-94%

27
II. PSIC profile analysis of homologous sequences
2. Calculate the profile matrix with the PSIC
algorithm
Profile matrix: $S_{a,j} = \ln(p_{a,j} / q_a)$,
$a = 1, \dots, 20$, $j = 1, \dots, N$, where $N$ is the alignment length
28
II. PSIC profile analysis of homologous sequences
3. Analyse the difference between profile scores for
the two amino acid variants
Asn → Cys: $S_{\mathrm{Asn},4} - S_{\mathrm{Cys},4} = 1.591$
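The idea of comparing two amino acid variants by their profile scores at one alignment column can be sketched as follows. This is a simplified stand-in and not the PSIC algorithm: the column probabilities are plain frequencies with a pseudocount rather than position-specific independent counts, the background frequencies are assumed uniform, and the function names and the toy alignment column are hypothetical.

```python
import math
from collections import Counter

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def profile_scores(column, pseudocount=1.0):
    """Log-odds score ln(p_a / q_a) for every amino acid at one alignment column.

    Assumptions (not from the slides): p_a is a pseudocount-smoothed frequency
    and q_a is a uniform background of 1/20.
    """
    counts = Counter(aa for aa in column if aa in AMINO_ACIDS)
    total = sum(counts.values()) + pseudocount * len(AMINO_ACIDS)
    background = 1.0 / len(AMINO_ACIDS)
    return {aa: math.log(((counts[aa] + pseudocount) / total) / background)
            for aa in AMINO_ACIDS}


def score_difference(column, wild_type, mutant):
    """Difference between the profile scores of the two variants at this column."""
    scores = profile_scores(column)
    return scores[wild_type] - scores[mutant]


# Toy column from a hypothetical alignment of ten homologs, mostly Asn
column = "NNNNNNDNNS"
delta = score_difference(column, "N", "C")
```

A large positive `delta` means the wild-type residue is well supported by the homologs while the substituted residue is not observed, which is the situation the slide's Asn → Cys example describes.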
29
III. 3D structure analysis
1. Residues that are in spatial contact with a
ligand or other critical residues
Bos taurus trypsin (PDB ID: 1ql7): residues within
5 Å of the ligand Zen 999
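A contact search of this kind reduces to a distance test between protein atoms and ligand atoms. The sketch below assumes the coordinates have already been read from a structure file; the function name `residues_in_contact`, the residue labels, and the coordinates are all made up for illustration and are not taken from 1ql7.

```python
import math


def residues_in_contact(protein_atoms, ligand_atoms, cutoff=5.0):
    """Return residues having at least one atom within `cutoff` angstroms of the ligand.

    protein_atoms: iterable of (residue_label, (x, y, z))
    ligand_atoms:  iterable of (x, y, z)
    """
    ligand_atoms = list(ligand_atoms)
    contacts = set()
    for residue, coord in protein_atoms:
        if any(math.dist(coord, lig) <= cutoff for lig in ligand_atoms):
            contacts.add(residue)
    return sorted(contacts)


# Hypothetical coordinates, in angstroms
protein_atoms = [
    ("RES A", (1.0, 0.0, 0.0)),     # 1.0 A from the ligand atom
    ("RES A", (9.0, 0.0, 0.0)),     # a second atom of the same residue, far away
    ("RES B", (0.0, 3.0, 3.0)),     # about 4.2 A away
    ("RES C", (10.0, 10.0, 10.0)),  # about 17.3 A away
]
ligand_atoms = [(0.0, 0.0, 0.0)]
contacts = residues_in_contact(protein_atoms, ligand_atoms)
```

With this toy input, `contacts` contains "RES A" and "RES B" but not "RES C". A residue counts as soon as any one of its atoms is inside the cutoff, which is one common convention; per-residue centroid distances would be an alternative.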
30
III. 3D structure analysis
2. Residues that form the hydrophobic core of the
protein (buried residues)
Bos taurus trypsin (PDB ID: 1ql7): surface residues
vs. buried residues
31
Structural parameters and contacts
  • Secondary structure
  • Phi-psi dihedral angles
  • Solvent accessible surface area, normed s.a.s.a
  • Change in accessible surface propensity
  • Change in residue side chain volume
  • Contacts with heteroatoms
  • Interchain contacts
  • Contacts with functional sites (BINDING,
    ACT_SITE, LIPID, and METAL)
  • Region of the phi-psi map (Ramachandran map)
  • Normalised B-factor (temperature factor)

32
(No Transcript)
33
Validation: control sets
                                  all     dam   unknown   dam/(dam+ben), %
Disease mutations
  Strict set                      444     366         3   82.9
  Total                         2,782   2,047        70   75.4
Between-species substitutions
  Total                           671      58         5   8.7
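The last column of the control-set figures can be recomputed from the other three. The sketch below assumes that every substitution is predicted as damaging, benign, or unknown, so that benign = all - dam - unknown; that reading of the columns is an inference, and the function name `damaging_fraction` is hypothetical.

```python
def damaging_fraction(total, damaging, unknown):
    """Fraction dam / (dam + ben), with ben inferred as total - damaging - unknown."""
    benign = total - damaging - unknown
    return damaging / (damaging + benign)


# (all, dam, unknown) as listed on the slide
control_sets = {
    "disease mutations, strict set": (444, 366, 3),
    "disease mutations, total": (2782, 2047, 70),
    "between-species substitutions, total": (671, 58, 5),
}

percentages = {name: 100 * damaging_fraction(*row)
               for name, row in control_sets.items()}
for name, value in percentages.items():
    print(f"{name}: {value:.1f}%")
```

Under this assumption the recomputed percentages agree with the slide's values to within 0.1 percentage point, which suggests the slide truncated rather than rounded.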

34
Validation: case studies
  • APEX1 protein: 24 out of 26 substitutions
    predicted correctly (Xi et al.)
  • Plasminogen activator inhibitor-2: 18 out of 20
    (Di Guisto et al.)
  • 3 HapMap populations and 10 primate species:
    analysis of 27,000 nsSNPs with frequencies
    (Victoria Carlton, AFFYMETRIX, private
    communication)

35
Validation: allele frequency
36
Validation: nsSNPs vs. human-mouse interspecies
variation
37
PolyPhen predictions for dbSNP b.121
(Ivan Adzhubei, 2004)
All
  • 9,502 unknown
  • 27,991 benign (67.6%)
  • 7,905 possibly damaging (19.1%)
  • 5,521 probably damaging (13.3%)
  • 50,919 total (44,005 unique rs IDs)
With structure
  • 42 unknown
  • 2,142 benign (57.1%)
  • 531 possibly damaging (14.2%)
  • 1,076 probably damaging (28.7%)
  • 3,791 total (,167 unique rs IDs)

38
PolyPhen predictions for dbSNP b.121
(Ivan Adzhubei, 2004)
All, filtered: ≥5 sequences in multiple alignment
  • 16,813 benign (64.2%)
  • 5,195 possibly damaging (19.8%)
  • 4,168 probably damaging (15.9%)
  • 26,176 total (21,677 unique rs IDs)
With structure, filtered: ≥5 sequences in multiple alignment
  • 2,021 benign (56.6%)
  • 499 possibly damaging (14.0%)
  • 1,050 probably damaging (29.4%)
  • 3,570 total (2,983 unique rs IDs)

39
Hydrophobic core stability parameters are the
best predictors
Ramensky et al., Nucleic Acids Res. (2002)
30:3894-3900
40
PolyPhen: http://www.bork.embl.de/PolyPhen
PolyPhen input:
  • Protein identifier OR sequence
  • Substitution position
  • Substitution type

41
PolyPhen: http://www.bork.embl.de/PolyPhen
42
PolyPhen: nsSNPs data collection
43
DAMAGING nsSNPs
Transthyretin (PDB 1tyr, SNP000012365): Thr118 → Asn
occurs at the ligand (REA) binding site
Figure labels: Thr 118, REA 130
44
DAMAGING nsSNPs
Trypsin (PDB 1trn, SNP000012965): Ser142 → Phe
results in a strong side-chain volume change at
a buried position
Figure label: Ser 142
45
Damaging nsSNPs
  • We estimate that 20% of non-synonymous cSNPs
    from databases are damaging
  • The average allele frequency of non-synonymous cSNPs
    predicted to be damaging is half that of
    benign non-synonymous cSNPs
  • We propose using these predictions to
    prioritise candidates for association
    studies

46
Development directions
  • Better multiple alignment pipeline
  • Compensated nsSNPs
  • Non-globular structural regions
  • Non-coding SNPs

47
An example of compensated pathogenic deviation
48
Polyphenism: the ability of a single genome to
produce two or more alternative morphologies
within a single population in response to an
environmental cue (such as temperature,
photoperiod, or nutrition). Dr. Ehab Abouheif,
McGill University, Montréal, Québec
The seasonal morphs of the buckeye butterfly,
Precis coenia (Nymphalidae). The ventral surfaces
are shown. The Summer morph ("linea") is on the
left; the Fall morph ("rosa") is on the right.
Scott F. Gilbert, A Companion to Developmental
Biology, Chapter 22, Seasonal Polyphenism in
Butterfly Wings
49
People
Shamil Sunyaev (1), Vasily Ramensky (2), Steffen
Schmidt (1), Ivan Adzhubei (1)
(1) Division of Genetics, Department of Medicine,
Brigham and Women's Hospital, Harvard Medical School,
Boston, USA
(2) Engelhardt Institute of Molecular Biology,
Moscow, Russia
Peer Bork, Yan P. Yuan (European Molecular
Biology Laboratory, Heidelberg, Germany)