Title: Correlated Mutations and Co-evolution
1. Correlated Mutations and Co-evolution
2. What is Co-evolution (Correlated Mutation)?
- Individual regions of proteins interact
- Regions can be either on the same chain or on different chains (complexes)
- A mutation in one half of the pair induces a change in the other half of the pair
- The tendency of positions in proteins to mutate co-ordinately (Pazos et al. 1997)
3. Correlated Mutations Contain Information about Protein-protein Interactions (Pazos et al. 1997)
- A possible aid to the docking problem, using only sequence information
- Docking: the process by which protein domains interact with one another (fitting)
4. Methodology
- The correlation coefficient
- S is the similarity between residues at the positions i/j of type k versus l
- Arbitrarily chosen cutoff: M predicted contacts (greatest L/2 values), i.e. M = L/2
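The coefficient itself is not written out on this slide. As a minimal sketch, assume a Pearson-style correlation between positions i and j, computed over the similarity values S of every pair of sequences (k, l) in the alignment; the normalisation and any sequence weighting in the paper may differ. The identity-based `similarity` function is a toy stand-in for a substitution matrix, and all function names here are hypothetical.

```python
from itertools import combinations
from math import sqrt

def similarity(a, b):
    # Toy stand-in for a substitution matrix: 1 for identical residues, else 0.
    return 1.0 if a == b else 0.0

def column_similarities(alignment, pos):
    # S values for one column: similarity of the residues at `pos`
    # for every pair of sequences (k, l) with k < l.
    return [similarity(alignment[k][pos], alignment[l][pos])
            for k, l in combinations(range(len(alignment)), 2)]

def correlation(alignment, i, j):
    # Pearson-style correlation of the S values at columns i and j.
    si = column_similarities(alignment, i)
    sj = column_similarities(alignment, j)
    n = len(si)
    mean_i, mean_j = sum(si) / n, sum(sj) / n
    sd_i = sqrt(sum((s - mean_i) ** 2 for s in si) / n)
    sd_j = sqrt(sum((s - mean_j) ** 2 for s in sj) / n)
    if sd_i == 0 or sd_j == 0:
        return 0.0  # fully conserved column: correlation undefined, treated as 0
    cov = sum((a - mean_i) * (b - mean_j) for a, b in zip(si, sj)) / n
    return cov / (sd_i * sd_j)

def predicted_contacts(alignment):
    # Rank all column pairs by correlation and keep the top M = L/2.
    length = len(alignment[0])
    pairs = [(correlation(alignment, i, j), i, j)
             for i, j in combinations(range(length), 2)]
    pairs.sort(reverse=True)
    m = length // 2
    return [(i, j) for _, i, j in pairs[:m]]

# Columns 0 and 1 change together (A with C, G with T); column 3 is conserved.
alignment = ["ACDA", "ACEA", "GTDA", "GTEA"]
top_pairs = predicted_contacts(alignment)
```

In this toy alignment the co-varying columns 0 and 1 rank first, and with L = 4 only two pairs are kept.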
5. The Harmonic Average (Xd)
- Measure of correlatedness
- Pic: percentage of correlated pairs with that distance; Pia: the same for all pairs
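The Xd formula is not shown on this slide. A minimal sketch, assuming Xd is the sum over n distance bins of (Pic - Pia) / (d_i * n), with d_i the upper limit of bin i, so that an excess of correlated pairs at short distances gives a positive value. The bin edges below are illustrative only and the function names are hypothetical.

```python
def bin_percentages(distances, bin_edges):
    # Percentage of distances falling in each bin (lower, upper].
    counts = [0] * (len(bin_edges) - 1)
    for d in distances:
        for b in range(len(counts)):
            if bin_edges[b] < d <= bin_edges[b + 1]:
                counts[b] += 1
                break
    total = len(distances)
    if total == 0:
        return [0.0] * len(counts)
    return [100.0 * c / total for c in counts]

def harmonic_average(correlated, all_pairs, bin_edges):
    # Assumed form: Xd = sum_i (Pic - Pia) / (d_i * n), d_i = upper bin limit.
    p_c = bin_percentages(correlated, bin_edges)
    p_a = bin_percentages(all_pairs, bin_edges)
    n = len(p_c)
    return sum((pc - pa) / (bin_edges[b + 1] * n)
               for b, (pc, pa) in enumerate(zip(p_c, p_a)))

# Illustrative distances in angstroms: correlated pairs sit in the closest bin.
edges = [0, 4, 8, 12]
xd = harmonic_average([2, 2], [2, 6, 10, 10], edges)
```

Dividing by the bin's upper limit weights short-distance bins more heavily, so Xd is zero when correlated pairs are distributed like all pairs and positive when they are closer than average.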
6. Comparisons of Correlations
7. Docking solutions test
- Note: larger percentages imply worse performance
- Special mention of 2gcr and 3adk
- Sequence information does not seem to be sufficient to discriminate
8. Figure 5: Scatter plot of Xd vs RMS distance
- 9pap; Hemoglobin 1hbb
9. Prediction: Hsc70
- Figure 6: predicted contacts of Nt and Ct domains of Hsc70
- Could be verified experimentally
10. Coevolving Protein Residues: Maximum Likelihood Identification and Relationship to Structure (Pollock et al. 1999)
- Using size and charge characteristics to define co-evolution (correlation)
- Negative correlation: correlation due to differences in charge (and thus also coevolution)
11. The Markov process model (simulated evolution)
- Two states, A and a
- Equation 1: the probability of transitioning between states
- λ: rate parameter
- π: equilibrium frequency
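Equation 1 is not reproduced on this slide. As a minimal sketch, assume the standard two-state continuous-time Markov chain, in which the probability of ending in a state relaxes exponentially from the starting state toward that state's equilibrium frequency. The names `lam` (rate parameter) and `freq_A` (equilibrium frequency of state A) are hypothetical, and the paper's exact parameterisation may differ.

```python
from math import exp

def transition_probability(start, end, t, lam, freq_A):
    # Equilibrium frequencies of the two states sum to 1.
    freq = {"A": freq_A, "a": 1.0 - freq_A}
    decay = exp(-lam * t)
    if start == end:
        # Probability of being in the starting state again after time t.
        return freq[end] + (1.0 - freq[end]) * decay
    # Probability of having switched to the other state after time t.
    return freq[end] * (1.0 - decay)

# Example: after a long time the probabilities approach the equilibrium frequencies.
p_switch = transition_probability("A", "a", t=50.0, lam=1.0, freq_A=0.6)
```

At t = 0 no change has occurred, and for any t the two probabilities leaving a given state sum to 1.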
12. Use of parameters in model
- Basic model for how they simulate evolutionary steps
13. Likelihood Ratio Test Statistic (LR)
- LI and LD: maximum likelihood values for the independent and dependent models
- Method of determining whether dependence is statistically significant
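A minimal sketch of how such a test could be applied, assuming LR is taken as the difference between the maximised log-likelihoods of the dependent and independent models, and that significance is judged against LR values obtained from data simulated under the independent model. Both function names and the example numbers are hypothetical.

```python
def likelihood_ratio(log_l_independent, log_l_dependent):
    # Larger values mean the dependent model fits the data better.
    return log_l_dependent - log_l_independent

def empirical_p_value(observed_lr, null_lrs):
    # Fraction of null (simulated, independent) LR values at least as large
    # as the observed one; the +1 terms avoid reporting a p-value of zero.
    exceed = sum(1 for lr in null_lrs if lr >= observed_lr)
    return (exceed + 1) / (len(null_lrs) + 1)

observed = likelihood_ratio(-120.0, -112.5)
p_value = empirical_p_value(observed, [0.1, 0.5, 1.0, 8.0])
```

Using simulated null values rather than a fixed threshold is one way to make the cutoff depend on the data set being tested.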
14. Test of Significance (LR values for change in parameters)
15. Myoglobin
- Used the structure of myoglobin and compared differences in sequences
- Variety of species used for sequence information; sperm whale for the 3D protein structure
16. LR distributions for myoglobin size and charge
- Note the large negative correlation LR values in charge
17. Co-evolution of Proteins with their Interaction Partners (Goh et al. 2000)
- Applied to PGK
- Chemokines
18. What is PGK?
19. Methodology
- Two independent sequence alignments, for N and C regions, using PSI-BLAST
- ClustalW to create distance matrix between complete domains
- To determine correlation, used equation below
- X and Y correspond to domains; r is a measure of relatedness between these domains
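The equation did not survive on this slide. A minimal sketch, assuming r is the linear (Pearson) correlation coefficient between corresponding entries of the two distance matrices X and Y, with both matrices listing the same species in the same order. The function names and the small matrices are hypothetical.

```python
from math import sqrt

def upper_triangle(matrix):
    # Pairwise distances above the diagonal, in a fixed order.
    n = len(matrix)
    return [matrix[i][j] for i in range(n) for j in range(i + 1, n)]

def matrix_correlation(x, y):
    # Pearson correlation between matching entries of two distance matrices.
    xs, ys = upper_triangle(x), upper_triangle(y)
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    num = sum((a - mean_x) * (b - mean_y) for a, b in zip(xs, ys))
    den = sqrt(sum((a - mean_x) ** 2 for a in xs)) * \
          sqrt(sum((b - mean_y) ** 2 for b in ys))
    if den == 0:
        raise ValueError("correlation undefined when all distances are equal")
    return num / den

# Toy distance matrices for three species; Y is X scaled by two.
x = [[0, 1, 2], [1, 0, 3], [2, 3, 0]]
y = [[0, 2, 4], [2, 0, 6], [4, 6, 0]]
r = matrix_correlation(x, y)
```

Values of r near 1 indicate that the two domains have diverged in parallel across species, which is the pattern expected if they co-evolve.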
20. PGK correlations
21. Chemokines
- Role of chemokines: importance in immunity (HIV, cancer)
- Four categories, mean nothing to me
22. Clustering of Chemokines
23. Clustering of Chemokine receptors