Lee, Juyong - PowerPoint PPT Presentation


Title: Lee, Juyong


1
About BoostThreader
  • Lee, Juyong
  • 2009. 08. 26

2
What is BoostThreader?
  • A Sequence-Structure threading program
  • Published by J. Xu's group
  • Known to be good for hard cases
  • Does not work for me

3
Let's thread!
  • ???
  • sequence
  • protein structure
  • scoring function
  • algorithm

[Figure: threading example with positions A to G, Match and Deletion labels, and a Good versus BAD alignment]
4
Three algorithms for Alignment!
I'm your father
I'm Andrei Andreyevich Markov.
  • Dynamic programming: traditional
  • Hidden Markov Chain: not that old, a generative model
  • Conditional Random Field: up to date

5
Dynamic programming
  • Finding the best scoring path on the alignment
    matrix

[Figure: alignment matrix with the Initial and Final cells marked; the best-scoring path between them is the alignment]
6
Dynamic programming
  • Finding the best scoring path on the alignment
    matrix

[Figure: the same alignment matrix with the path from Initial to Final drawn in; the path is the alignment]
7
More about Dynamic Programming
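Follow the maximum scoring path!
[Figure: one cell of the alignment matrix, with SEQUENCE positions i, i+1 and STRUCTURE positions j, j+1. Three moves lead into F(i+1, j+1):]
  • match (score f): A aligned with a
  • deletion (gap penalty g = -1): A aligned with a gap
  • insertion (gap penalty h = -1): a gap aligned with a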
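
The maximum-scoring-path idea on this slide can be sketched in a few lines of Python. This is a generic global-alignment sketch, not BoostThreader's code: the names `align` and `toy_score` are hypothetical, the match score is a toy function, and calling the sequence-against-gap move "deletion" is an assumption about the slide's convention. Only the gap penalty of -1 comes from the slide.

```python
def align(seq, struct, match_score, gap=-1):
    """Global alignment by dynamic programming; returns (score, top row, bottom row)."""
    n, m = len(seq), len(struct)
    # F[i][j] = best score for aligning seq[:i] with struct[:j]
    F = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        F[i][0] = F[i - 1][0] + gap
    for j in range(1, m + 1):
        F[0][j] = F[0][j - 1] + gap
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            F[i][j] = max(
                F[i - 1][j - 1] + match_score(seq[i - 1], struct[j - 1]),  # match
                F[i - 1][j] + gap,  # deletion (assumed naming): residue against a gap
                F[i][j - 1] + gap,  # insertion (assumed naming): gap against a position
            )
    # Trace back from the Final cell to the Initial cell to recover the path.
    i, j, top, bottom = n, m, [], []
    while i > 0 or j > 0:
        if (i > 0 and j > 0 and
                F[i][j] == F[i - 1][j - 1] + match_score(seq[i - 1], struct[j - 1])):
            top.append(seq[i - 1]); bottom.append(struct[j - 1]); i -= 1; j -= 1
        elif i > 0 and F[i][j] == F[i - 1][j] + gap:
            top.append(seq[i - 1]); bottom.append("-"); i -= 1
        else:
            top.append("-"); bottom.append(struct[j - 1]); j -= 1
    return F[n][m], "".join(reversed(top)), "".join(reversed(bottom))


def toy_score(residue, position):
    """Hypothetical match score: +2 if the letters agree, -1 otherwise."""
    return 2 if residue.lower() == position else -1


score, top, bottom = align("ABDE", "abe", toy_score)
print(score)
print(top)
print(bottom)
```

The traceback prefers a match whenever several moves tie, which is an arbitrary choice made for this sketch.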
8
??? ???? ???
  • More specific!
  • ??? ?? ? ?? ???? ???
  • What is ???
  • ??? ??? ??? ?? alignment OTL
  • Sequence-Structure alignment? ?? Dali server or
    TMalign ??

9
In Conventional seq.-str. alignment
  • Linear sum of similarities of properties
  • Only the functions for the Match and Gap cases are
    needed!
  • F_match = w1 × sim(predicted SS, real SS)
    + w2 × sim(predicted SA, real SA)
    + w3 × sim(predicted residue depth, real depth)
  • F_gap = gap opening penalty + gap extension
    penalty
  • Only the next step is considered!
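
To make the linear scoring on this slide concrete, here is a minimal Python sketch. Everything specific in it is an assumption made for illustration: the `similarity` function, the weights, the penalty values, the dictionary keys, and the choice to charge the extension penalty once per additional gap position. Only the overall shape (a weighted sum for matches, opening plus extension for gaps) comes from the slide.

```python
def similarity(predicted, observed):
    """Hypothetical similarity for values scaled to [0, 1]: 1 minus the absolute difference."""
    return 1.0 - abs(predicted - observed)


def f_match(pred, real, weights=(0.5, 0.3, 0.2)):
    """Linear sum of property similarities (weights w1, w2, w3 are made up)."""
    keys = ("ss", "sa", "depth")  # secondary structure, solvent accessibility, residue depth
    return sum(w * similarity(pred[k], real[k]) for w, k in zip(weights, keys))


def f_gap(length, opening=-2.0, extension=-0.5):
    """Gap score: opening penalty once, extension penalty for each further position (assumed)."""
    return opening + extension * (length - 1)


pred = {"ss": 1.0, "sa": 0.5, "depth": 0.25}   # properties predicted from the sequence
real = {"ss": 1.0, "sa": 0.25, "depth": 0.25}  # properties observed in the structure
print(f_match(pred, real))
print(f_gap(3))
```

Because the score is a fixed weighted sum, it cannot express interactions between properties, which is the limitation the following slides address.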

10
What's different in BoostThreader?
  • Depends on both the current and the next step!
  • Nine scoring functions are necessary (one per state
    transition, e.g. M→M, M→I)!
  • Gap penalty is context-dependent
  • Trained from reference alignments!
  • DALI, TMalign, etc.
  • Regression Trees are used as scoring functions
  • Not a linear function!

11
Regression Tree? ? ????
12
???????
Hey nature, not all flies are Drosophila
13
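Regression Tree!
[Figure: a regression tree built by Training! on a data set marked "100"; three yes/no splits on thresholds involving 1500cc, 5, and a distance in km (labels and units partly lost in extraction) lead to four leaves with the values 8, 5, 15, and 11.]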
14
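Example in Threading
[Figure: regression tree whose inputs are the properties predicted for the sequence and the properties observed in the structure. It splits once on SS and twice on SA (yes/no branches), and its four leaves hold the values 0.1, 0.3, 0.6, and 0.9, each shown next to a count out of 10 (1, 3, 6, and 9).]
Estimate Prob. from examples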
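
A small Python sketch of "estimate probabilities from examples": each training example is routed to a leaf, and the leaf's value is the fraction of its examples that are truly aligned. The tree shape, the 0.3 threshold, the field names, and the toy counts are all hypothetical; only the idea of leaf values such as 9 out of 10 comes from the slide.

```python
from collections import defaultdict


def leaf_of(example):
    """Route an example down a fixed two-level tree (hypothetical splits) to a leaf name."""
    ss = "ss_match" if example["ss_match"] else "ss_mismatch"
    sa = "sa_close" if example["sa_diff"] < 0.3 else "sa_far"
    return ss + "/" + sa


def leaf_probabilities(examples):
    """Leaf value = aligned examples in the leaf / all examples in the leaf."""
    counts = defaultdict(lambda: [0, 0])  # leaf -> [aligned count, total count]
    for ex in examples:
        leaf = leaf_of(ex)
        counts[leaf][0] += ex["aligned"]
        counts[leaf][1] += 1
    return {leaf: pos / total for leaf, (pos, total) in counts.items()}


def make(ss_match, sa_diff, aligned_count, total=10):
    """Build `total` toy examples for one leaf, `aligned_count` of them aligned."""
    return [{"ss_match": ss_match, "sa_diff": sa_diff, "aligned": int(k < aligned_count)}
            for k in range(total)]


examples = (make(True, 0.1, 9) + make(True, 0.5, 6) +
            make(False, 0.1, 3) + make(False, 0.5, 1))
probs = leaf_probabilities(examples)
for leaf, p in sorted(probs.items()):
    print(leaf, p)
```

In a real regression tree the splits themselves are also learned from the data; here they are fixed so that only the leaf estimation is shown.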
15
Advantage of Tree
  • Fast
  • Interaction between variables can be easily
    considered

16
What's really happening in BoostThreader?
  • Initial Setting
  • Set all F0(u→v, seq(i), str(j)) = 0
  • P ∝ exp(F)
  • 30 ?? ?? Sequence-Structure alignment!
  • Calculate the probabilities of all possible state transitions!
  • Probabilities of all examples!
  • Forward-backward algorithm

17
All Possible Transitions?
For M→M
ABDE a b c d mmimd
Generate examples!
AB ab
AB bc
AB cd
BD ab
BD bc
BD cd
DE ab
DE bc
DE cd
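
The match-to-match listing above pairs every two consecutive sequence residues with every two consecutive structure positions. A short Python sketch of that enumeration, with hypothetical function names, might look like this:

```python
from itertools import product


def adjacent_pairs(items):
    """Consecutive pairs of a string, e.g. "ABDE" -> ["AB", "BD", "DE"]."""
    return [items[k] + items[k + 1] for k in range(len(items) - 1)]


def mm_examples(seq, struct):
    """Every consecutive sequence pair combined with every consecutive structure pair."""
    return list(product(adjacent_pairs(seq), adjacent_pairs(struct)))


examples = mm_examples("ABDE", "abcd")
for seq_pair, struct_pair in examples:
    print(seq_pair, struct_pair)
```

For the sequence ABDE and structure abcd this yields 3 × 3 = 9 candidate examples, the same nine pairs listed on the slide.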
18
Examples(2)
For M→I
ABDE a b c d mmimd
Generate examples!
B- ab
B- bc
B- cd
A- ab
A- bc
A- cd
D- ab
D- bc
D- cd
E- ab
E- bc
E- cd
19
Inside BoostThreader
  • Examples and their probabilities
  • Calculated with the current scoring functions
  • Modify Scoring Functions
  • ???? F? ??! F1 = F0 + (1 - P)
  • ???? F? ??! F1 = F0 - P
  • Add trees until prediction quality doesn't
    increase
  • F = F0 + F1 + F2 + F3 + F4 + F5
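
The loop on this slide (score the examples, then add a tree that pushes F up where the probability is too low and down where it is too high) can be sketched as a toy boosting routine. This is a simplified stand-in, not BoostThreader's procedure: the probability comes from a plain logistic function instead of the forward-backward algorithm, each "tree" is just one mean residual per feature value, and the number of rounds is fixed instead of stopping when prediction quality no longer improves. All names are hypothetical.

```python
import math
from collections import defaultdict


def sigmoid(f):
    """Toy stand-in for P: maps a score F to a probability."""
    return 1.0 / (1.0 + math.exp(-f))


def fit_leaf_means(features, residuals):
    """Stand-in for a regression tree: one leaf per feature value, holding the mean residual."""
    groups = defaultdict(list)
    for x, r in zip(features, residuals):
        groups[x].append(r)
    return {x: sum(rs) / len(rs) for x, rs in groups.items()}


def boost(features, labels, rounds=5):
    trees = []  # F is the sum of all trees added so far; with no trees, F = 0

    def F(x):
        return sum(tree[x] for tree in trees)

    for _ in range(rounds):
        probs = [sigmoid(F(x)) for x in features]
        # Residual is (1 - P) for a true example (label 1) and -P for a false one (label 0).
        residuals = [y - p for y, p in zip(labels, probs)]
        trees.append(fit_leaf_means(features, residuals))
    return trees, F


features = ["close"] * 4 + ["far"] * 4
labels = [1, 1, 1, 0, 0, 0, 0, 1]
trees, F = boost(features, labels)
print(F("close"), F("far"))
```

After a few rounds the score for "close" is positive and the score for "far" is negative, because three of four "close" examples are true and three of four "far" examples are false.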

20
Performance
21
Summary
  • BoostThreader considers both the current and the next step
  • The scoring function consists of Regression Trees
  • Trees are trained on examples

22
?????!