Multivariate Analysis of Protein Polymorphism (MAPP) - PowerPoint PPT Presentation


Slides: 15
Provided by: arend2

Transcript and Presenter's Notes

Title: Multivariate Analysis of Protein Polymorphism (MAPP)


1
Multivariate Analysis of Protein Polymorphism (MAPP)
  • Purpose and Basics
  • Algorithm Outline and Performance
  • How to view MAPP results in ProPhylER

© All Content Arend Sidow / ProPhylER 2008
2
Multivariate Analysis of Protein Polymorphism (MAPP)
  • Purpose and Basics

3
The Impact of Amino Acid Variants
Amino Acid Change
Impaired Function of Variant
Phenotype
cSNP
When and Where the protein acts, Dosage, etc.
Protein Structure and Function
Deterministic
MAPP's prediction target: How strongly does a
variant amino acid affect the protein's structure
or function?
4
MAPP Concept
  • MAPP addresses specific variants in single
    positions of the protein sequence
  • More specifically, it uses the evolutionary
    variation in single columns of the alignment for
    predictions of the impact of all possible
    variants on structure and function of the protein
  • In contrast to ESF, which considers averages of
    neighboring sites, MAPP focuses on single sites
    and single variants
  • Consider the variation in the two framed columns
    on the left
  • The red-framed column has a lot of variation that
    does not appear to be constrained in obvious ways
  • The blue-framed column has very little variation, and
    that variation preserves a certain characteristic
    (small size of side chain)
  • MAPP quantifies the intuition that there are
    significant differences in constraint acting upon
    the red and blue columns, and generates
    predictions of functional impact of variants

5
MAPP in ProPhylER
For each amino acid in each protein ...
... and for each possible variant ...
... calculate an impact score from the observed
evolutionary variation. The impact score is
converted to a P-value that describes the
confidence that the variant is consistent with
structure or function of the protein. Low
P-values predict highly deleterious substitutions.
6
MAPP Methodology: Rationale
  • The observed variation is a sample that reflects
    specific structural or functional constraints on
    that position
  • MAPP quantifies these constraints by converting
    the letter information in each column into
    the corresponding physicochemical values
  • Key concept
  • The conversion allows calculating the mean and
    the variance in each column for each
    physicochemical property
  • The variance is a statistical reflection of the
    tolerated variation
  • The further a potential variant (polymorphism) is
    outside of the variance of the observed data, the
    more likely it is to be deleterious

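Example alignment column: I I I I I V V V V A A A A A A
Property value for each amino acid:
A 1.8, C 2.5, D -3.5, E -3.5, F 2.8, G -0.4, H -3.2, I 4.5, K -3.9, L 3.8,
M 1.9, N -3.5, P -1.6, Q -3.5, R -4.5, S -0.8, T -0.7, V 4.2, W -0.9, Y -1.3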
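The rationale on this slide can be sketched in a few lines of Python. This is only an illustration of the mean/variance idea for a single property, not the MAPP implementation: it treats every sequence equally (no tree-based weights), uses the property values listed on the slide (they match the Kyte-Doolittle hydropathy index), and the function names are made up for this example.

```python
from statistics import mean, pvariance

# Property values as listed on the slide, one per amino acid.
SCALE = {
    "A": 1.8, "C": 2.5, "D": -3.5, "E": -3.5, "F": 2.8, "G": -0.4, "H": -3.2,
    "I": 4.5, "K": -3.9, "L": 3.8, "M": 1.9, "N": -3.5, "P": -1.6, "Q": -3.5,
    "R": -4.5, "S": -0.8, "T": -0.7, "V": 4.2, "W": -0.9, "Y": -1.3,
}

# The example alignment column from the slide (5 x I, 4 x V, 6 x A).
COLUMN = "IIIIIVVVVAAAAAA"


def column_stats(column, scale=SCALE):
    """Unweighted mean and population variance of one property in a column."""
    values = [scale[aa] for aa in column]
    return mean(values), pvariance(values)


def deviation(variant, column, scale=SCALE):
    """Distance of a variant from the column mean, in column standard deviations.

    Assumes the column is not completely invariant (variance > 0).
    """
    mu, var = column_stats(column, scale)
    return abs(scale[variant] - mu) / var ** 0.5


mu, var = column_stats(COLUMN)
for variant in "LD":
    print(variant, round(deviation(variant, COLUMN), 2))
```

In this toy setting a leucine variant sits close to the observed values, while an aspartate variant lies many standard deviations away, which is the intuition the slide describes for a likely deleterious substitution.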
7
Multivariate Analysis of Protein Polymorphism (MAPP)
  • Algorithm Outline and Performance

Stone EA, Sidow A. Physicochemical constraint
violation by molecular missense substitutions
mediates impairment of protein function and
disease severity. Genome Res. 2005 Jul;15(7):978-986
8
MAPP Methodology: General Outline
  • MAPP uses scales of six important physicochemical
    properties:
      • Hydropathy
      • Polarity
      • Charge
      • Volume of side chain
      • Free energy in alpha helical conformation
      • Free energy in beta sheet conformation
  • The property scales are standardized so the
    values from different scales are comparable to
    one another
  • MAPP also decorrelates the scales, which is
    necessary because certain scales (such as
    hydropathy and polarity) are correlated
  • MAPP generates impact scores for all possible
    variants from the observed evolutionary variation
  • MAPP impact scores are converted to P-values,
    which are displayed on the ProPhylER interface
  • The lower the P-value, the higher the chance that
    the substitution will be deleterious for
    structure or function of the protein
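Standardizing and decorrelating two property scales can be sketched as follows. This is a hedged illustration only: the slides do not say how MAPP decorrelates its scales, so the sketch swaps in a simple residualization (regress the second standardized scale on the first and keep the rescaled residual). The first list reuses a few hydropathy values from the earlier slide; the second list is made-up, polarity-like data, not a real property scale.

```python
from statistics import mean, pstdev


def standardize(values):
    """Rescale values to zero mean and unit (population) standard deviation."""
    mu, sd = mean(values), pstdev(values)
    return [(v - mu) / sd for v in values]


def correlation(xs, ys):
    """Pearson correlation, computed as the mean product of standardized values."""
    zx, zy = standardize(xs), standardize(ys)
    return mean([a * b for a, b in zip(zx, zy)])


def decorrelate(xs, ys):
    """Return standardized xs plus a version of ys that is uncorrelated with xs.

    Assumption: simple residualization, not necessarily what MAPP does.
    """
    zx, zy = standardize(xs), standardize(ys)
    r = mean([a * b for a, b in zip(zx, zy)])
    residual = [b - r * a for a, b in zip(zx, zy)]
    return zx, standardize(residual)


# Values for I, V, A, S, D, K taken from the slide's property listing.
scale_a = [4.5, 4.2, 1.8, -0.8, -3.5, -3.9]
# Made-up second scale, chosen to be strongly correlated with the first.
scale_b = [0.1, 0.2, 0.0, 1.5, 3.0, 3.2]

print(round(correlation(scale_a, scale_b), 3))                # strongly negative
print(round(correlation(*decorrelate(scale_a, scale_b)), 3))  # approximately zero
```

After standardization the two scales are on the same footing, and after decorrelation a deviation in one property is no longer counted twice through its correlated partner.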

9
MAPP Methodology: Algorithm
(1) and (3) are ProPhylER's tree and alignment;
weights (2, from Branch-Manager) are used to
calculate a column-specific summary of
physicochemical properties (4). (5) The mean
describes the average property, the variance
describes the degree of constraint for each
property. For each possible substitution in the
column, and for each physicochemical property,
MAPP generates a score. These scores are
combined in a way that decorrelates the
physicochemical properties (7). The scores are
then converted to P-values (not shown).
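The weighted column summary and the per-property score can be sketched as below. This is a minimal illustration under stated assumptions: the weights are hypothetical numbers standing in for the tree-derived weights, the score is a plain signed deviation in weighted standard deviations, and neither the combination across properties nor the conversion to P-values is attempted.

```python
def weighted_stats(values, weights):
    """Weighted mean and weighted (population) variance of one property."""
    total = sum(weights)
    mu = sum(w * v for v, w in zip(values, weights)) / total
    var = sum(w * (v - mu) ** 2 for v, w in zip(values, weights)) / total
    return mu, var


def property_score(variant_value, values, weights):
    """Signed deviation of a variant from the weighted mean, in weighted SDs.

    Assumes the column shows some variation (variance > 0).
    """
    mu, var = weighted_stats(values, weights)
    return (variant_value - mu) / var ** 0.5


# Property values for a three-sequence column (I, I, A).
values = [4.5, 4.5, 1.8]
# Hypothetical weights: the two sequences carrying I are assumed to be
# close relatives, so each counts half as much as the sequence carrying A.
weights = [0.25, 0.25, 0.5]

mu, var = weighted_stats(values, weights)
score_d = property_score(-3.5, values, weights)  # candidate variant D
print(round(mu, 2), round(var, 4), round(score_d, 2))
```

With equal weights this reduces to the ordinary mean and variance; the weights only change how much each sequence contributes to the column summary.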
10
Test: Binary Predictions on Mutation Impact Data
11
Prediction Accuracy for HIV Protease
(Chart axis: sequence position, 1 to 99)
The amino acid in HIV-1 protease is in blue; red
or green boxes show the experimentally tested
sequence variants. Correct MAPP predictions
(variant was reduced in activity, and was
correctly predicted to be so; or variant
was fully functional, and was correctly
predicted to be so) are in green. Incorrect MAPP
predictions are in red. This chart is for the
reduced activity accuracy described below, for
which the P-value cutoff was 0.01. For the
magnitude of the decrease in activity, the
P-value cutoff was 0.001.
Reduced activity (functional vs. reduced or
dead): 80.4% prediction accuracy
Magnitude of decrease (reduced vs. dead): 76.3%
prediction accuracy
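How a binary prediction accuracy of this kind is computed from P-values and a cutoff can be sketched as follows. The records are made-up illustrative pairs, not the HIV protease data, and the function names are hypothetical.

```python
def predict_impaired(p_value, cutoff=0.01):
    """Call a variant impaired when its P-value falls below the cutoff."""
    return p_value < cutoff


def accuracy(records, cutoff=0.01):
    """Fraction of variants whose prediction matches the experimental result.

    Each record is (P-value, experimentally_impaired).
    """
    correct = sum(
        predict_impaired(p, cutoff) == impaired for p, impaired in records
    )
    return correct / len(records)


# Made-up example data: (P-value, was the variant impaired in the assay?)
records = [
    (0.0004, True),
    (0.2, False),
    (0.03, True),   # missed at a cutoff of 0.01
    (0.005, True),
    (0.6, False),
]

print(accuracy(records))        # 4 of 5 correct at the 0.01 cutoff
print(accuracy(records, 0.05))  # a looser cutoff also catches the third variant
```

The cutoff trades missed impaired variants against functional variants wrongly flagged, which is why the slide reports a different cutoff for each of the two tests.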
12
Multivariate Analysis of Protein Polymorphism (MAPP)
  • How to view MAPP results in ProPhylER

13
MAPP Track
For each position in the protein, for each
possible amino acid variant, the MAPP display
shows a color for the predicted deleteriousness.
Red is predicted to be deleterious with high
confidence, blue is unlikely to be deleterious,
and intermediate colors cover the range in between.
(Mousing over the fields will show the P-values.
Low P-values are strong predictions for
deleteriousness, high P-values mean the variant
is unlikely to be deleterious.)
14
Physicochemical Property Importance
MAPP analyses also allow inference as to whether
a particular property is important in the given
alignment position.
For evolutionarily constrained positions ..
.. certain properties stand out as important
but in regions of much evolutionary variation ..
.. no property is important
(Shading is proportional to likely importance.
Mousing over the fields will show the P-values.
Low P-values are strong predictions for the
importance of the property, high P-values mean
the property is unlikely to be important.)