HW Clarifications - PowerPoint PPT Presentation

About This Presentation
Title:

HW Clarifications

Description:

Prediction of functional/structural sites in a protein using conservation and hyper-variation ... UUU CUU (Phe Leu): non-synonymous. 31 ... – PowerPoint PPT presentation

Slides: 52
Provided by: tal8

Transcript and Presenter's Notes



1
HW Clarifications
Identity and Homology
  • Homology implies shared ancestry
  • Partial sequence identity does not necessarily
    imply homology
  • A high coverage of sequence identity can imply
    homology

2
HW Clarifications
Insertions and Deletions
3
Prediction of functional/structural sites in a
protein using conservation and hyper-variation
(ConSeq, ConSurf, Selecton)
4
Empirical findings of conservation: variation
among sites
Functional/structural sites evolve more slowly than
nonfunctional/nonstructural sites
5
Conservation = functional/structural importance
6
Histone 3 protein
7
Alignment: pre-pro-insulin
Xenopus  MALWMQCLP-LVLVLLFSTPNTEALANQHL
Bos      MALWTRLRPLLALLALWPPPPARAFVNQHL

Xenopus  CGSHLVEALYLVCGDRGFFYYPKIKRDIEQ
Bos      CGSHLVEALYLVCGERGFFYTPKARREVEG

Xenopus  AQVNGPQDNELDG-MQFQPQEYQKMKRGIV
Bos      PQVG---ALELAGGPGAGGLEGPPQKRGIV

Xenopus  EQCCHSTCSLFQLENYCN
Bos      EQCCASVCSLYQLENYCN
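As a rough illustration of how conservation varies along this alignment, the following Python sketch marks the columns where the two sequences carry the same residue. The sequences are copied from the slide; the function names are illustrative, and a two-sequence identity mask is only a crude stand-in for the rate-based scores discussed later.

```python
# Aligned pre-pro-insulin sequences copied from the slide ('-' marks a gap).
XENOPUS = (
    "MALWMQCLP-LVLVLLFSTPNTEALANQHL"
    "CGSHLVEALYLVCGDRGFFYYPKIKRDIEQ"
    "AQVNGPQDNELDG-MQFQPQEYQKMKRGIV"
    "EQCCHSTCSLFQLENYCN"
)
BOS = (
    "MALWTRLRPLLALLALWPPPPARAFVNQHL"
    "CGSHLVEALYLVCGERGFFYTPKARREVEG"
    "PQVG---ALELAGGPGAGGLEGPPQKRGIV"
    "EQCCASVCSLYQLENYCN"
)


def identity_mask(seq_a, seq_b):
    """Return '*' for identical, non-gap columns and ' ' for all others."""
    if len(seq_a) != len(seq_b):
        raise ValueError("aligned sequences must have the same length")
    return "".join(
        "*" if a == b and a != "-" else " " for a, b in zip(seq_a, seq_b)
    )


def fraction_identical(seq_a, seq_b):
    """Fraction of alignment columns that are identical and ungapped."""
    mask = identity_mask(seq_a, seq_b)
    return mask.count("*") / len(mask)


if __name__ == "__main__":
    print(identity_mask(XENOPUS, BOS))
    print(f"identical columns: {fraction_identical(XENOPUS, BOS):.2f}")
```

The identical columns cluster in some regions of the alignment rather than being spread evenly, which is the pattern the slide is meant to show.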
8
(No Transcript)
9
(No Transcript)
10
Conservation based inference
  • Conserved sites
      • Important for the function or structure
      • Not allowed to mutate
      • Slowly evolving sites = low rate of evolution
  • Variable sites
      • Less important (usually)
      • Change more easily
      • Fast-evolving sites = high rate of evolution

11
Detecting conservation: evolutionary rates
  • Rate = distance / time
  • Distance = number of substitutions per site
  • Time = 2 × years (doubled because the sequences
    evolved independently)
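
The rate definition above can be sketched in a few lines of Python. The function name and the numbers in the example are illustrative assumptions, not values from the slides.

```python
def evolutionary_rate(distance, divergence_time_years):
    """Rate = distance / (2 * time).

    distance: substitutions per site between two sequences.
    divergence_time_years: years since the sequences shared an ancestor.
    The time is doubled because both lineages evolved independently
    after the split.
    """
    if divergence_time_years <= 0:
        raise ValueError("divergence time must be positive")
    return distance / (2.0 * divergence_time_years)


if __name__ == "__main__":
    # Hypothetical example: 0.2 substitutions per site, 100 million years.
    rate = evolutionary_rate(0.2, 100e6)
    print(f"{rate:.2e} substitutions per site per year")
```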

12
Rate computation
MSA
Phylogeny
Evolutionary Model
13
http://conseq.tau.ac.il
Site-specific rate computation tool
14
Locating the active site of Pyruvate kinase
Glycolysis pathway
15
(No Transcript)
16
(No Transcript)
17
(No Transcript)
18
Conservation scores
  • The scores are standardized: the average score of
    all residues is 0, and the standard deviation is
    1
  • Negative values: slowly evolving (= low
    evolutionary rate), conserved sites
  • The most conserved site in the protein has the
    lowest score
  • Positive values: rapidly evolving (= fast
    evolutionary rate), variable sites
  • The most variable site in the protein has the
    highest score

Scores are relative to the protein and cannot be
compared between different proteins!
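
The standardization described above is an ordinary z-score transform. A minimal sketch follows; the input rates are made up, and the use of the population standard deviation is an assumption, since the slides do not say which variant the tools use.

```python
from statistics import mean, pstdev


def standardize(rates):
    """Rescale per-site rates to mean 0 and standard deviation 1."""
    mu = mean(rates)
    sigma = pstdev(rates)  # population SD; an assumption, see lead-in
    if sigma == 0:
        raise ValueError("all sites have the same rate; cannot standardize")
    return [(r - mu) / sigma for r in rates]


if __name__ == "__main__":
    # Hypothetical per-site evolutionary rates for a five-residue protein.
    rates = [0.2, 0.5, 1.0, 2.3, 0.1]
    for site, score in enumerate(standardize(rates), start=1):
        label = "conserved" if score < 0 else "variable"
        print(f"site {site}: {score:+.2f} ({label})")
```

Because the mean and standard deviation are computed within one protein, a score of -1 in one protein does not correspond to the same absolute rate as -1 in another.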
19
(No Transcript)
20
(No Transcript)
21
Combining protein structure
  • Each protein has a particular 3D structure that
    determines its function
  • Protein structure is better conserved than
    protein sequence and more closely related to
    function
  • Analyzing a protein structure is more
    informative than analyzing its sequence for
    function inference

22
Conservation in the structure
Protein core: structurally constrained - usually
conserved
Active site: functionally constrained - usually
conserved
Surface: tolerant to mutations - usually variable
(Figure labels: active site, surface, core)
23
http://consurf.tau.ac.il
Same algorithm as ConSeq, but here the results
are projected onto the 3D structure of the
protein

24
The structure-function of the potassium channel
(Figure labels: transmembrane region, cytoplasm)
25
(No Transcript)
26
(No Transcript)
27
(No Transcript)
28
(No Transcript)
29
ConSeq/ConSurf user intervention (advanced
options)
  • Choosing the method for calculating the
    amino-acid conservation scores (Bayesian/Max
    Likelihood)
  • Entering your own MSA file
  • Performing the MSA using (MUSCLE/CLUSTALW)
  • Collecting the homologs from (SWISS-PROT/UniProt)
  • Max. number of homologs (50)
  • No. of PSI-BLAST iterations (1)
  • PSI-BLAST E-value cutoff (0.001)
  • Model of substitution for proteins
    (JTT/Dayhoff/mtREV/cpREV/WAG)
  • Entering your own PDB file
  • Entering your own TREE file

30
Codon-level selection
  • ConSeq/ConSurf
      • Compute the evolutionary rate of amino-acid sites
        → the data are amino acids
      • Compute only the rate of non-synonymous
        substitutions

UUU → UUC (Phe → Phe): synonymous
UUU → CUU (Phe → Leu): non-synonymous
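
The distinction between the two kinds of substitution can be checked mechanically by translating both codons. The sketch below assumes the standard genetic code; the function and constant names are illustrative.

```python
from itertools import product

# Standard genetic code, written in the conventional U, C, A, G order.
# '*' marks a stop codon.
BASES = "UCAG"
AMINO_ACIDS = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def classify_substitution(codon_before, codon_after):
    """Return 'synonymous' if both codons encode the same amino acid."""
    before = codon_before.upper().replace("T", "U")
    after = codon_after.upper().replace("T", "U")
    if CODON_TABLE[before] == CODON_TABLE[after]:
        return "synonymous"
    return "non-synonymous"


if __name__ == "__main__":
    # The two examples from the slide.
    print("UUU -> UUC:", classify_substitution("UUU", "UUC"))
    print("UUU -> CUU:", classify_substitution("UUU", "CUU"))
```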
31
Synonymous vs. non-synonymous substitutions
For most proteins, the rate of synonymous
substitutions is much higher than the
non-synonymous rate. This is called purifying
selection (= conservation in ConSeq/ConSurf).
32
Synonymous vs. nonsynonymous substitutions
There are rare cases where the non-synonymous
rate is much higher than the synonymous rate.
This is called positive (Darwinian) selection.
33
Positive Selection
  • The hypothesis: positive selection
    promotes the fitness of the organism
  • Examples
      • Pathogen proteins evading the host immune system
      • Proteins of the immune system detecting pathogen
        proteins
      • Pathogen proteins that are drug targets
      • Proteins that are products of gene duplication
      • Proteins involved in the reproductive system

34
Computing synonymous and non-synonymous rates
Phylogeny
Codon MSA
Evolutionary Model
35
Inferring positive selection
  • Look at the ratio between the non-synonymous rate
    (Ka) and the synonymous rate (Ks)

36
Inferring positive selection
  • Ka/Ks < 1: purifying selection
  • Ka/Ks > 1: positive selection
  • Ka/Ks = 1: no selection (neutral)
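
The three cases translate directly into a small classifier. This is a sketch only: the function name and the tolerance used to decide that a ratio equals 1 are assumptions, and a point estimate above 1 is not by itself evidence of positive selection, because it can arise by chance.

```python
import math


def classify_selection(ka, ks, rel_tol=1e-9):
    """Classify a site or gene by its Ka/Ks ratio.

    ka: non-synonymous substitution rate.
    ks: synonymous substitution rate (must be positive).
    rel_tol: relative tolerance for treating the ratio as 1 (assumed).
    """
    if ks <= 0:
        raise ValueError("Ks must be positive to form the ratio")
    ratio = ka / ks
    if math.isclose(ratio, 1.0, rel_tol=rel_tol):
        return "neutral"
    return "purifying" if ratio < 1.0 else "positive"


if __name__ == "__main__":
    # Hypothetical rate estimates.
    for ka, ks in [(0.1, 0.5), (0.6, 0.3), (0.2, 0.2)]:
        print(f"Ka/Ks = {ka / ks:.2f}: {classify_selection(ka, ks)}")
```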




37
  • Our evolutionary model assumes there is positive
    selection in the data
  • By chance alone we expect our model to find a few
    sites with Ka/Ks > 1
  • Is this really indicative of positive selection
    or plain randomness?
  • Maybe there's no positive selection after all?

38
Solution: statistically compare the hypotheses
  • H0: There is no positive selection
  • H1: There is positive selection
  • Perform a statistical test to accept or reject H0
    (likelihood ratio test)
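
A likelihood ratio test compares the maximized log-likelihoods of the two nested models: twice their difference is compared against a chi-square distribution. The sketch below is a generic version and not Selecton's implementation. The log-likelihood values are invented, and the degrees of freedom (the number of extra free parameters in H1) depend on the models being compared, so they are passed in as an assumption. Only 1 and 2 degrees of freedom are supported, to keep the example free of third-party dependencies.

```python
import math


def chi2_survival(x, df):
    """P(X > x) for a chi-square variable with 1 or 2 degrees of freedom."""
    if x <= 0:
        return 1.0
    if df == 1:
        return math.erfc(math.sqrt(x / 2.0))
    if df == 2:
        return math.exp(-x / 2.0)
    raise ValueError("this sketch supports only df = 1 or df = 2")


def likelihood_ratio_test(lnl_h0, lnl_h1, df, alpha=0.05):
    """Return (statistic, p_value, reject_h0) for nested models H0 and H1."""
    statistic = 2.0 * (lnl_h1 - lnl_h0)
    p_value = chi2_survival(statistic, df)
    return statistic, p_value, p_value < alpha


if __name__ == "__main__":
    # Hypothetical log-likelihoods for the null and alternative models.
    stat, p, reject = likelihood_ratio_test(-1000.0, -995.0, df=2)
    print(f"statistic = {stat:.2f}, p = {p:.4f}, reject H0: {reject}")
```

Rejecting H0 supports the presence of positive selection in the data; failing to reject it means sites with Ka/Ks > 1 may simply be noise.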

39
Note: saturation of synonymous substitutions
(Figure labels: Syn., Nonsyn.)
Human and wheat are too evolutionarily remote:
synonymous substitutions are saturated. Pick
closer sequences for positive selection analysis.
40
http://selecton.tau.ac.il
41
Selecton input
Codon-level sequences!
  • Coding sequences - only ORFs
  • No stop codons
  • If an MSA is provided it must be codon aligned
    (RevTrans)
  • The user must provide the sequences; there is no
    PSI-BLAST option
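
Before submitting sequences, the first two requirements can be checked locally. The following sketch is not part of Selecton; the function name, the exact checks, and the use of the standard genetic code's stop codons are assumptions made for illustration.

```python
# Stop codons of the standard genetic code (DNA alphabet).
STOP_CODONS = {"TAA", "TAG", "TGA"}


def check_coding_sequence(sequence):
    """Return a list of problems found in an unaligned coding sequence."""
    seq = sequence.upper().replace("U", "T")
    problems = []
    if set(seq) - set("ACGT"):
        problems.append("contains characters other than A, C, G, T")
    if len(seq) % 3 != 0:
        problems.append("length is not a multiple of 3")
    # Walk the sequence codon by codon in the first reading frame.
    for start in range(0, len(seq) - 2, 3):
        codon = seq[start:start + 3]
        if codon in STOP_CODONS:
            problems.append(f"stop codon {codon} at codon {start // 3 + 1}")
    return problems


if __name__ == "__main__":
    # Hypothetical input sequences.
    for name, seq in [("ok", "ATGTTTCTT"), ("bad", "ATGTAACT")]:
        issues = check_coding_sequence(seq)
        print(name, "->", issues if issues else "looks usable")
```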

42
Positive selection in the primate TRIM5α
43
Primate TRIM5α
TRIM5α from humans, rhesus monkeys, and African
green monkeys are all unable to restrict
retroviruses isolated from their own species, yet
are able to restrict retroviruses from the other
species.
TRIM5α is an important natural barrier to
cross-species retrovirus transmission.
TRIM5α is in an antagonistic conflict with the
retroviral capsid proteins.
TRIM5α is under positive selection.
44
Positive selection analysis
45
Positive selection analysis in Selecton
H1
H0
46
Comparing H0 and H1 in Selecton
47
Comparing H0 and H1 in Selecton
48
(No Transcript)
49
Selecton results
50
(No Transcript)
51
Results
Human-rhesus swaps at sites 332 and 335-340 (SPRY)
significantly elevate human resistance to HIV and
rhesus resistance to SIV.