Doug Raiford - PowerPoint PPT Presentation

Transcript and Presenter's Notes

Title: Doug Raiford


1
Protein Structure Searches
  • Doug Raiford
  • Lesson 18

2
Problem definition
  • Given a protein conformation can we find other
    structurally similar proteins?
  • Might have a database of structures (like the PDB)

3
If we have a predicted and a known structure
  • Can do a simple RMSD to compare the two
    conformations
  • Know precisely which aas compare to which
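
A minimal sketch of the RMSD comparison described above, assuming each conformation is given as a list of (x, y, z) coordinates with one point per aa, in the same order, and already placed in a common frame. The function name and the sample coordinates are illustrative and not from the lesson.

```python
import math

def rmsd(coords_a, coords_b):
    """Root-mean-square deviation between two equally long lists of (x, y, z) points."""
    if len(coords_a) != len(coords_b):
        raise ValueError("both conformations must have the same number of residues")
    if not coords_a:
        raise ValueError("need at least one residue")
    total = 0.0
    # Residue i of one conformation is compared with residue i of the other.
    for (ax, ay, az), (bx, by, bz) in zip(coords_a, coords_b):
        total += (ax - bx) ** 2 + (ay - by) ** 2 + (az - bz) ** 2
    return math.sqrt(total / len(coords_a))

# Made-up coordinates: the middle residue is displaced by 1 unit along y.
predicted = [(0.0, 0.0, 0.0), (1.5, 0.0, 0.0), (3.0, 0.0, 0.0)]
known = [(0.0, 0.0, 0.0), (1.5, 1.0, 0.0), (3.0, 0.0, 0.0)]
print(rmsd(predicted, known))
```

The one-to-one pairing in `zip` is exactly the "know precisely which aas compare to which" assumption; it no longer holds once the sequences differ.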

4
What about if not identical sequences?
  • Must map aas from one to aas in the other
  • How might you do this?
  • Sequence similarity
  • MSAs

5
Have we seen before?
  • 3D-PSSM
  • Sequence alignment integrated with 3D alignment
  • Stored in a profile (position-specific scoring
    matrix)
  • Generates 1D profiles first (MSAs)
  • Then uses a structural alignment program (SAP) to
    augment profiles with structural similarity

6
SAP (structural alignment program)
  • Aligning secondary structures

7
How?
  • What do you think of when you hear that you will
    need to align two things?
  • Dynamic programming

[Figure: dynamic programming grid; columns α α α T β β β, rows α α β β β β]
8
Scoring
  • Three components
  • AA similarity (substitution matrix)
  • Local structure
  • E.g. both aas members of alpha helix
  • Solvent exposure

Are the associated AAs similar sequence-wise
(e.g. both glycines)?
Are they both in a similar local structure?
Are they both buried or both exposed to solvent?
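
The three scoring components can be sketched as a per-residue score fed into an ordinary global alignment. This is only an illustration: the weights, the gap penalty, the identity check standing in for a substitution matrix, and the (aa, structure, burial) encoding are all assumptions, not SAP's actual scoring.

```python
def residues(aas, structure, burial):
    """Bundle three equal-length strings into (aa, structure, burial) tuples."""
    return list(zip(aas, structure, burial))

def residue_score(r1, r2):
    """Sum of three toy components: aa similarity, local structure, solvent exposure."""
    aa1, ss1, exp1 = r1
    aa2, ss2, exp2 = r2
    score = 2 if aa1 == aa2 else -1    # stand-in for a substitution matrix lookup
    score += 1 if ss1 == ss2 else -1   # both in the same local structure?
    score += 1 if exp1 == exp2 else 0  # both buried or both exposed?
    return score

def align_score(seq1, seq2, gap=-2):
    """Global (Needleman-Wunsch) alignment score with a linear gap penalty."""
    n, m = len(seq1), len(seq2)
    dp = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        dp[i][0] = i * gap
    for j in range(1, m + 1):
        dp[0][j] = j * gap
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            dp[i][j] = max(
                dp[i - 1][j - 1] + residue_score(seq1[i - 1], seq2[j - 1]),
                dp[i - 1][j] + gap,   # gap in seq2
                dp[i][j - 1] + gap,   # gap in seq1
            )
    return dp[n][m]

# H = helix, T = turn, E = strand; B = buried, X = exposed (made-up data).
a = residues("GAVLK", "HHHTE", "BBXXX")
b = residues("GAVK", "HHHE", "BBXX")
print(align_score(a, b))
```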
9
Benefits
  • SAP (structure alignment) allows a profile to be
    influenced by secondary structure
  • Useful to 3D-PSSM in threading decisions
    (which aas match to a profile)
  • Homology-based protein conformation prediction is
    enhanced by making better decisions on where to
    insert gaps/varying-length loops

10
Another already seen
  • PFAM
  • Have Markov Models for protein families
  • Sequences that match models have high probability
    of matching conformation
  • Even though we are not comparing structures
    (query to target)
  • We are matching a sequence to its most probable
    structure

[Figure labels: Pfam, HMMER]
11
What about similar structure in an alternative
way?
  • Can't really align
  • How else might it work?

12
Dali (distance matrix alignment)
  • How might two distance matrices look?
  • All pairwise distances from each aa to all other
    aas
  • If the proteins are identical, the matrices would
    be almost identical

Low distance region in matrix if parallel
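
A small sketch of the distance matrix idea, assuming one (x, y, z) point per aa (for example the alpha carbon). The coordinates are made up. It also shows why the matrices can be compared without aligning the structures first: moving a protein does not change its internal distances.

```python
import math

def distance_matrix(coords):
    """All pairwise distances from each aa to every other aa."""
    return [[math.dist(p, q) for q in coords] for p in coords]

protein = [(0.0, 0.0, 0.0), (3.0, 4.0, 0.0), (3.0, 4.0, 12.0)]
# The same protein shifted by (10, 10, 10).
shifted = [(x + 10.0, y + 10.0, z + 10.0) for x, y, z in protein]

d1 = distance_matrix(protein)
d2 = distance_matrix(shifted)
for row in d1:
    print(row)
```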
13
How to turn this into a similarity score?
  • Find optimum set of similar sub-structures
  • Even if in different 1D locations
  • Find amino acid equivalence
  • Once have equivalence can easily compare
    structure similarity
  • E.g. with RMSD

14
Approach
  • Break matrix into a bunch of overlapping
    sub-matrices
  • Do an all-pairs comparison of the sub-matrices
  • Sub-matrices that naturally extend one another
    are merged
  • Must find pairings of sub-matrices that yield
    best overall score
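
The first two steps above can be sketched as follows. The block size, the tolerance, and the mean-absolute-difference measure are illustrative assumptions; the merging and the search for the best overall set of pairings are not shown.

```python
def submatrices(dist, size):
    """Yield ((row, col), block) for every size x size block.

    The window slides one residue at a time, so neighbouring blocks overlap.
    """
    n = len(dist)
    for i in range(n - size + 1):
        for j in range(n - size + 1):
            yield (i, j), [row[j:j + size] for row in dist[i:i + size]]

def block_difference(a, b):
    """Mean absolute difference of two blocks; 0 means identical distance patterns."""
    size = len(a)
    total = sum(abs(a[r][c] - b[r][c]) for r in range(size) for c in range(size))
    return total / (size * size)

def similar_block_pairs(dist1, dist2, size=3, tolerance=0.5):
    """All-pairs comparison: keep the block positions whose patterns agree."""
    pairs = []
    for pos1, block1 in submatrices(dist1, size):
        for pos2, block2 in submatrices(dist2, size):
            if block_difference(block1, block2) <= tolerance:
                pairs.append((pos1, pos2))
    return pairs

# Toy distance matrix of a straight chain of 4 aas spaced 1 unit apart.
chain = [[abs(i - j) for j in range(4)] for i in range(4)]
pairs = similar_block_pairs(chain, chain, size=3, tolerance=0.0)
print(pairs)
```

Note that a block at one position can match a block at a different position, which is what lets similar sub-structures be found even if they sit at different 1D locations.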

[Figure: two 7 x 7 distance matrices, rows and columns numbered 1 to 7]
15
How to optimize the choice of pairings
[Figure: 7 x 7 matrix, rows and columns numbered 1 to 7]
  • Monte Carlo approach
  • Randomly generate pairings
  • Calculate overall similarity
  • Multiple solutions in parallel
  • Slowly improve each by randomly altering pairings
    (like a random search)
  • Have some probability of keeping a solution that
    is worse than previous
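
A toy sketch of the Monte Carlo search above. It assumes the problem has been reduced to a score table `score[i][j]` for pairing item i of one protein with item j of the other, and that a pairing is a permutation; both are simplifications for illustration. The acceptance rule (Metropolis criterion), the temperature, and the number of runs are also assumptions. The several solutions are run one after another here rather than in parallel.

```python
import math
import random

def total_score(pairing, score):
    """Overall similarity of a pairing, where item i is paired with pairing[i]."""
    return sum(score[i][j] for i, j in enumerate(pairing))

def monte_carlo_pairing(score, n_runs=4, steps=2000, temperature=1.0, seed=0):
    rng = random.Random(seed)
    n = len(score)
    best, best_score = None, float("-inf")
    for _ in range(n_runs):
        current = list(range(n))
        rng.shuffle(current)                      # randomly generated pairing
        current_score = total_score(current, score)
        if current_score > best_score:
            best, best_score = current[:], current_score
        for _ in range(steps):
            a, b = rng.sample(range(n), 2)        # randomly alter the pairing
            current[a], current[b] = current[b], current[a]
            delta = total_score(current, score) - current_score
            # Always keep improvements; keep a worse solution with some probability.
            if delta >= 0 or rng.random() < math.exp(delta / temperature):
                current_score += delta
                if current_score > best_score:
                    best, best_score = current[:], current_score
            else:
                current[a], current[b] = current[b], current[a]  # undo the swap
    return best, best_score

# Made-up score table in which pairing i with i is clearly best.
score = [[5 if i == j else 0 for j in range(4)] for i in range(4)]
best, best_score = monte_carlo_pairing(score)
print(best, best_score)
```

Occasionally keeping a worse solution is what lets the search escape a local optimum that plain hill climbing would get stuck in.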

16
Once we have aa associations
  • Can determine similarity
  • How?

17
Have to minimize aa distances
  • Must perturb XYZ (translation) and roll, pitch,
    and yaw (rotation) of one of the proteins to
    minimize RMSD
  • Like linear regression
  • Can't do this until we know which aas are
    associated
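
Written out, the minimization might look like the following. The symbols are assumptions introduced here: x_i and y_i are the coordinates of the i-th associated aa pair, N is the number of pairs, R is a rotation, and t is a translation.

```latex
\mathrm{RMSD}(R, t) = \sqrt{\frac{1}{N} \sum_{i=1}^{N} \left\lVert R\,x_i + t - y_i \right\rVert^{2}},
\qquad
(R^{*}, t^{*}) = \arg\min_{R,\; t} \ \mathrm{RMSD}(R, t)
```

The sum runs over associated pairs only, which is why the aa associations have to be known first.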

18
Have to minimize aa distances
  • Some numeric methods start by fixing between 2
    and 4 amino acids
  • Some short cuts
  • Center of gravity is the average of all vectors
  • Translate
  • ave(p1) - ave(p2)
  • Singular value decomposition to rotate (like
    eigenvectors)
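
A minimal sketch of the translation shortcut, assuming each protein is a list of (x, y, z) points. The function names are illustrative. The rotation step by singular value decomposition (commonly known as the Kabsch algorithm) is not shown here.

```python
def centroid(coords):
    """Center of gravity: the average of all position vectors."""
    n = len(coords)
    return tuple(sum(p[k] for p in coords) / n for k in range(3))

def translate_onto(moving, fixed):
    """Shift `moving` so that its centroid coincides with the centroid of `fixed`."""
    c_moving, c_fixed = centroid(moving), centroid(fixed)
    # The translation is the difference of the two averages.
    shift = tuple(c_fixed[k] - c_moving[k] for k in range(3))
    return [tuple(p[k] + shift[k] for k in range(3)) for p in moving]

# Made-up coordinates: p2 is p1 displaced by (10, 5, 1).
p1 = [(0.0, 0.0, 0.0), (2.0, 0.0, 0.0)]
p2 = [(10.0, 5.0, 1.0), (12.0, 5.0, 1.0)]
moved = translate_onto(p2, p1)
print(moved)
```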

19
(No Transcript)
20
Score more complex so
  • Requires double dynamic programming
  • If the matrix is n x m, then n times m different
    matrices are generated, each pinning the return
    path to one aa pair
  • Used to generate a position specific scoring
    which is then used in aa similarity scoring
  • Reduces the constraint that two particular aas
    are equivalent
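
A simplified sketch of the double dynamic programming idea above. For every aa pair (i, j) a low-level alignment is forced through that pair, giving n times m low-level problems; the resulting table is then used as the position-specific score in one high-level alignment. The symbol-match similarity, the gap penalty, and the prefix-plus-suffix way of pinning are assumptions for illustration and not SAP's actual scoring, which compares structural environments.

```python
def global_score(a, b, sim, gap=-1):
    """Needleman-Wunsch score of aligning a with b under similarity function sim."""
    prev = [j * gap for j in range(len(b) + 1)]
    for i in range(1, len(a) + 1):
        cur = [i * gap]
        for j in range(1, len(b) + 1):
            cur.append(max(prev[j - 1] + sim(a[i - 1], b[j - 1]),
                           prev[j] + gap,
                           cur[j - 1] + gap))
        prev = cur
    return prev[-1]

def pinned_scores(a, b, sim, gap=-1):
    """Low level: best alignment score when a[i] is forced to match b[j]."""
    return [[global_score(a[:i], b[:j], sim, gap)          # everything before the pin
             + sim(a[i], b[j])                             # the pinned pair itself
             + global_score(a[i + 1:], b[j + 1:], sim, gap)  # everything after
             for j in range(len(b))]
            for i in range(len(a))]

def double_dp(a, b, sim, gap=-1):
    """High level: ordinary alignment scored with the pinned table."""
    pinned = pinned_scores(a, b, sim, gap)
    n, m = len(a), len(b)
    dp = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        dp[i][0] = i * gap
    for j in range(1, m + 1):
        dp[0][j] = j * gap
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            dp[i][j] = max(dp[i - 1][j - 1] + pinned[i - 1][j - 1],
                           dp[i - 1][j] + gap,
                           dp[i][j - 1] + gap)
    return dp[n][m], pinned

def match(x, y):
    """Toy similarity: +1 for the same structure symbol, -1 otherwise."""
    return 1 if x == y else -1

# H = helix, T = turn (made-up secondary-structure strings).
score, pinned = double_dp("HHT", "HHT", match)
print(score)
for row in pinned:
    print(row)
```

Because each cell of the pinned table reflects how well the rest of the alignment works when that pair is matched, no single pair has to be declared equivalent in advance.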

[Figure: repeated low-level dynamic programming grids (columns α α α T, rows α α β), one per pinned aa pair]