Title: 1990????Cortes ? Vapnik ???
1??????????????????????
- ?????
- ???? ?????
- ????????????????
2???????????
- ???????????????????????
- 1990????Cortes ? Vapnik ???
- ???????????????????????????????????????
- ???????????????????????????????????????????
- ????
- ???????????????????
- ??????????
- ???????????
- ????????
- cf. Kernel Methods in Computational Biology, MIT Press, 2004
3 ???????????
- ????????????????(???????)???????????
- ?????????????????????????????
4 SVM????????????
- ?????????????(SVM)
- ???????????????????????(??)???
- ????????????????????????????????????????
5 ????
- ??????????????????????
- $\Phi(x)$ (feature vector) ??????????????
- Kernel: $K(x,y) = \Phi(x) \cdot \Phi(y)$
- x ? y ??????? ? K(x,y)??
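6 Examples of kernels
- Linear kernel: $K(x,y) = x \cdot y$
- Polynomial kernel: $K(x,y) = (x \cdot y + c)^d$
- RBF kernel: $K(x,y) = \exp(-\|x - y\|^2 / 2\sigma^2)$
- Sigmoid kernel (????????????)
- $K(x,y) = \tanh(\kappa\, x \cdot y - \delta)$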
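The four kernels listed on this slide can be written down directly. The following is a minimal sketch in plain Python, assuming vectors are given as lists of floats; the function names and the default parameter values (`c`, `d`, `sigma`, `kappa`, `delta`) are illustrative choices, not values taken from the slides.

```python
import math


def dot(x, y):
    """Inner product of two equal-length vectors."""
    return sum(a * b for a, b in zip(x, y))


def linear_kernel(x, y):
    # K(x, y) = x . y
    return dot(x, y)


def polynomial_kernel(x, y, c=1.0, d=2):
    # K(x, y) = (x . y + c)^d ; c and d are illustrative defaults
    return (dot(x, y) + c) ** d


def rbf_kernel(x, y, sigma=1.0):
    # K(x, y) = exp(-||x - y||^2 / (2 sigma^2))
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-sq_dist / (2.0 * sigma ** 2))


def sigmoid_kernel(x, y, kappa=1.0, delta=0.0):
    # K(x, y) = tanh(kappa * x . y - delta)
    return math.tanh(kappa * dot(x, y) - delta)


x = [1.0, 2.0]
y = [3.0, 0.5]
print(linear_kernel(x, y), polynomial_kernel(x, y), rbf_kernel(x, y))
```

Each function takes two vectors and returns a single real number, which is all an SVM needs from a kernel.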
7 ????????????
- ??????? $K(x,y) = \Phi(x) \cdot \Phi(y)$
- Mercer?????? ? ????
- ??????
- ?????? ( $x_1, x_2, \ldots, x_n$ ??????)
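Mercer's condition is commonly checked in practice through the Gram matrix built from sample points x_1, ..., x_n: for a valid kernel this matrix is symmetric and positive semidefinite. The sketch below assumes that reading of the slide. It uses a hand-written Cholesky attempt, which tests strict positive definiteness (sufficient for semidefiniteness, but not necessary). The helper names and the parameter values are hypothetical.

```python
import math


def rbf_kernel(x, y, sigma=1.0):
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-sq_dist / (2.0 * sigma ** 2))


def sigmoid_kernel(x, y, kappa=1.0, delta=2.0):
    return math.tanh(kappa * sum(a * b for a, b in zip(x, y)) - delta)


def gram_matrix(kernel, points):
    """Matrix of kernel values K(x_i, x_j) for all pairs of points."""
    return [[kernel(p, q) for q in points] for p in points]


def is_positive_definite(matrix, tol=1e-12):
    """Return True if a Cholesky factorisation of the symmetric matrix succeeds."""
    n = len(matrix)
    lower = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = matrix[i][j] - sum(lower[i][k] * lower[j][k] for k in range(j))
            if i == j:
                if s <= tol:  # non-positive pivot: not positive definite
                    return False
                lower[i][j] = math.sqrt(s)
            else:
                lower[i][j] = s / lower[j][j]
    return True


points = [[0.0], [1.0], [2.0]]
print(is_positive_definite(gram_matrix(rbf_kernel, points)))      # RBF Gram matrix
print(is_positive_definite(gram_matrix(sigmoid_kernel, points)))  # sigmoid Gram matrix
```

With these particular parameters the sigmoid Gram matrix has a negative diagonal entry, tanh(-2), so it cannot be positive semidefinite. This illustrates why the sigmoid function is not a valid kernel for every parameter choice.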
8 ????????
- ???????????(feature vector)????????????
- ?????????
- ?????? ?????
- ???????? x ????
- $\Phi(x)$ = (???, ??, ???, logP, ...)
- ???????? x,y ?????????
- $\Phi(x)$ ? $\Phi(y)$ ??????
9?????????????????
- SCOP????????????
- ??????
- ??????(???????HMM)
- ?????
- ???????
- ?????
10 ???????????
- ???????????????????(3????)????????????
- ???????????
- ???????????????????45?????
11 ??????? (Fold Recognition)
- ???3???????????????(fold)???
- ????????????????????????(Chothia, 1992)????
12 SCOP??????
- ?????????????????????
- ???????????????
[Figure: SCOP hierarchy tree. SCOP Root -> Class.1, Class.2, ... -> Fold.1, Fold.2, ... -> Super Family.1, Super Family.2, ... -> Family.1, Family.2, Family.3 -> example amino acid sequences]
mkkrltitlsesvlenlekmaremglsksamisvalenykkgq
ispqarafleevfrrkqslnskekeevakkcgitplqvrvwfinkrmrs
13 Super Family ??
- ????? SCOP ????????????????????
[Figure: Super Family.1, Super Family.2, Super Family.3]
??????? madqlteeqiaefkeafslfdkdgdgtittkelgtvmrslgqnpteaelqdminevdadgngtidfpefltmmark
14????????????
15 ???????????????????
- HMM???????????
- Fisher kernel (Jaakkola et al., 2000)
- Marginalized kernel (Tsuda et al., 2002)
- ???????????????
- Spectrum kernel (Leslie et al., 2002)
- Mismatch kernel (Leslie et al., 2003)
- ?????????????????????
- SVM pairwise (Liao & Noble, 2002)
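Of the sequence kernels listed above, the spectrum kernel is the simplest to state: each sequence is mapped to its vector of k-mer counts, and the kernel is the inner product of the two count vectors. The sketch below follows that standard definition; the function names and the choice of `k` are illustrative only.

```python
from collections import Counter


def kmer_counts(sequence, k):
    """Count every length-k substring (k-mer) of the sequence."""
    return Counter(sequence[i:i + k] for i in range(len(sequence) - k + 1))


def spectrum_kernel(x, y, k=3):
    """Inner product of the k-mer count vectors of x and y."""
    counts_x = kmer_counts(x, k)
    counts_y = kmer_counts(y, k)
    return sum(count * counts_y[kmer] for kmer, count in counts_x.items())


print(spectrum_kernel("ABAB", "BABA", k=2))
```

Here "ABAB" contains AB twice and BA once, and "BABA" contains BA twice and AB once, so the kernel value is 2*1 + 1*2 = 4.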
16 ????????
- ?????????????????????
- 2?????3???????????????
- ???????????????(?????)
- ???????????????????(????????)???
17?????(????)
- ???(???????)?????????
- PAM250, BLOSUM45 ??
18???????????
- ???2?????????????????????????
- ??????????????????(????????)??????????O(mn)???????
(m,n???????)
19????????????????(1)(Needleman-Wunsch??????)
- ????????????????
- ???????????????????????
- ????????????
20????????????????(2)
DP (?????)??? ????(???)???
? O(mn)??
???????????? F(m,n)??max??????? F(i,j)???????????
(???????)
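The O(mn) dynamic programming table F(i,j) for global alignment (Needleman-Wunsch) can be sketched as follows. The scoring scheme here (match +1, mismatch -1, linear gap penalty -1) is an assumption made for brevity. The slides refer to score matrices such as PAM250 and BLOSUM45, which would replace the simple `match`/`mismatch` rule.

```python
def global_alignment_score(x, y, match=1, mismatch=-1, gap=-1):
    """Optimal global alignment score of x and y via an (m+1) x (n+1) DP table."""
    m, n = len(x), len(y)
    F = [[0] * (n + 1) for _ in range(m + 1)]
    # Aligning a prefix against the empty string costs one gap per residue.
    for i in range(1, m + 1):
        F[i][0] = i * gap
    for j in range(1, n + 1):
        F[0][j] = j * gap
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            s = match if x[i - 1] == y[j - 1] else mismatch
            F[i][j] = max(
                F[i - 1][j - 1] + s,  # align x[i-1] with y[j-1]
                F[i - 1][j] + gap,    # x[i-1] against a gap
                F[i][j - 1] + gap,    # y[j-1] against a gap
            )
    return F[m][n]


print(global_alignment_score("ACGT", "ACT"))
```

Each of the (m+1)(n+1) cells is filled in constant time, which gives the O(mn) running time. The optimal score is read from F(m,n).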
21 ??????????(1) (Smith-Waterman??????)
- ???????????????????
- ???????????????
- ?????????????????
- ????HEAWGEH ? GAWED ????
- A W G E
- A W - E
- ????????????
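Local alignment (Smith-Waterman) differs from the global case in two ways: every cell is floored at 0, and the answer is the maximum over the whole table rather than the last cell. The sketch below uses an assumed simple scoring scheme (match +2, mismatch -1, linear gap -1), not the score matrix behind the slide's example.

```python
def local_alignment_score(x, y, match=2, mismatch=-1, gap=-1):
    """Best local alignment score of x and y (Smith-Waterman, linear gaps)."""
    m, n = len(x), len(y)
    H = [[0] * (n + 1) for _ in range(m + 1)]
    best = 0
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            s = match if x[i - 1] == y[j - 1] else mismatch
            H[i][j] = max(
                0,                    # start a new local alignment here
                H[i - 1][j - 1] + s,  # extend with a (mis)match
                H[i - 1][j] + gap,    # gap in y
                H[i][j - 1] + gap,    # gap in x
            )
            best = max(best, H[i][j])
    return best


print(local_alignment_score("HEAWGEH", "GAWED"))
```

Under this assumed scoring, the pair HEAWGEH and GAWED scores 5, which corresponds to aligning AWGE with AW-E (three matches and one gap).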
22 ??????????(2)
????? ??
23 LA kernel
- SW???????????????????
- ? MAX ??????????????
- ?????HMM?????????????
- ???
- SW?????????????HMM???
- SW?????? ???????
- LA????
- ??????????????(????)?
- ???????????O(mn)???LA?????????????
24LA???????(1)
- ??(??)?????? $K_a^{(\beta)}(x,y)$
- ???????? $K_g^{(\beta)}(x,y)$
25 LA???????(2)
- ?????????(convolution)
- ??????????? n ?????????
- LA????
26 LA?????SW??????
- p(????)??????
- S(x,y,p) ??????p????
- ?????????????
??
27 LA?????SW???
- SW??? 1????????????????
- LA???? ????????????????
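The contrast on this slide (the SW score keeps only the single best local alignment, while the LA kernel accumulates contributions from all local alignments) can be illustrated numerically. The sketch assumes the relation given in Saigo et al. (2004), where the SW score is the limit of (1/β) ln K_LA as β grows. The list of alignment scores S(x,y,p) is hypothetical, and no actual alignments are enumerated here.

```python
import math


def soft_max_score(scores, beta):
    """(1/beta) * ln(sum_p exp(beta * S_p)), computed in a numerically stable way."""
    top = max(scores)
    total = sum(math.exp(beta * (s - top)) for s in scores)
    return top + math.log(total) / beta


# Hypothetical scores of three candidate local alignments p of one sequence pair.
alignment_scores = [1.0, 3.0, 5.0]

for beta in (0.5, 5.0, 50.0):
    print(beta, soft_max_score(alignment_scores, beta))
print("max:", max(alignment_scores))
```

For small β every alignment contributes noticeably. As β increases the value approaches the maximum score, which is the SW-style quantity.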
28 SVM?????????????? (1)
- ?????????????SVM???
- ???????????SVM?????????????
[Figure: one SVM per superfamily. Super Family.1 / SVM.1, Super Family.2 / SVM.2, Super Family.3 / SVM.3]
??????? madqlteeqiaefkeafslfdkdgdgtittkelgtvmrslgqnpteaelqdminevdadgngtidfpefltmmark
29 SVM?????????????? (2)
- ????
- ????????????????????????(??)
- ??
- ?????????????????????????(??)
- ???
- ?????????????????????????????????(??????)
30 ??????????
- LA???????
- 1????O(mn)?? ($|x| = m$, $|y| = n$)
- ????????? N ?????????? n
- ? ??? $O(N^2 n^2)$ ?? ? ??????
???
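The O(N^2 n^2) total follows from one O(n^2) dynamic programming table per pair of sequences, with on the order of N^2 pairs. A rough estimate can be sketched as below. The values N = 1000 and n = 200 are made-up illustrations, not figures from the experiments, and the count assumes the kernel matrix is symmetric so that only N(N+1)/2 pairs are evaluated.

```python
def gram_matrix_cost(num_sequences, seq_length):
    """Return (number of sequence pairs, total DP cells) for a symmetric kernel matrix."""
    pairs = num_sequences * (num_sequences + 1) // 2  # upper triangle incl. diagonal
    cells_per_pair = seq_length * seq_length          # one O(n^2) DP table per pair
    return pairs, pairs * cells_per_pair


pairs, cells = gram_matrix_cost(num_sequences=1000, seq_length=200)
print(pairs, cells)
```

Because each pair is computed independently of the others, the work divides naturally across CPUs.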
31????????
- LA???????
- 1????O(mn)????????????
- ?????????????????????
- 1CPU?????????
- ?????
- SGI ORIGIN 3800 (R14000(500MHz) 256CPU)
- PC cluster HPC (2.8GHz Xeon 8CPU)
- ???
- LSF (Load Sharing Facility) ? script ??????
- ????????(????????????CPU???)
- ??????????????
- ????????????????
32????????
33 ROC???????
??????????????
34 mRFP???????
??????????????
35 ??
- ???????????????????????
- Smith-Waterman?????????HMM??????
- ??????????????????????
- ????????????
??
- ?????????(??????)????????????????
36??????????????????
37???????
- Graph G(V,E)
- ????????????????????????
- ??????????(?????????)
- V ????? E ????
- ??????????????????????
- ???????????????????????
- ???????
- ?????? $G_1(V_1,E_1)$ ? $G_2(V_2,E_2)$ ????????
38 Marginalized kernel
- Tsuda??2002????
- ??
- h,h ??????K????
- ?????RNA?????????
39 Marginalized graph kernel (1)
- Kashima??2003????
- h ??? $G_1$ ??????
- h' ??? $G_2$ ??????
- l(h) ?? h ????(???)??
- K(x,y) ????????????
- (? K(x,y) = 1 if x = y, otherwise 0)
40 Marginalized graph kernel (2)
41 Marginalized graph kernel (3)
42 Marginalized graph kernel (4)
43 Marginalized graph kernel (5)
- ??????????
- h ??? $G_1$ ??????
- h' ??? $G_2$ ??????
- ??????????????????????????????????????????????????
? - ($|V_1||V_2| \times |V_1||V_2|$ ??????????)
44 Marginalized graph kernel (6)
45 Marginalized graph kernel (7)
46 Marginalized graph kernel (8)
- Marginalized ??????????????
47Marginalized ???????????
- ??(???)?????????????
- ????????????????????????
- ??????????(???x???????y????)2)??????????????????
- ????????????????(??????)???????????????????????
- ? ????(Morgan Index)???????
- ?????????????
48 Morgan??????
- ??????????????????????1960?????
- CAS (Chemical Abstracts Service)???
- ??????????(???)??????????????????????
- ????????????????
- ???????????????????(?????)
- ? Marginalized ?????????????????????????????????
- ???????????????????????????????
- ? ????????????????????????
49 Morgan??????????
- ?????????1??????
- ?????? x ?????????
- x ?????????????????x ??????
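The update described on this slide matches the usual formulation of the Morgan index: every atom starts with the value 1, and in each iteration an atom's new value is the sum of its neighbours' current values. The sketch below assumes that formulation and represents a molecule as a plain adjacency list. The five-atom graph and the function name are hypothetical.

```python
def morgan_indices(adjacency, iterations):
    """Morgan indices after the given number of neighbour-sum iterations."""
    values = {atom: 1 for atom in adjacency}  # step 1: assign 1 to every atom
    for _ in range(iterations):
        # step 2: replace each value by the sum over adjacent atoms (synchronous update)
        values = {
            atom: sum(values[nb] for nb in neighbours)
            for atom, neighbours in adjacency.items()
        }
    return values


# Hypothetical skeleton: atom 0 bonded to atoms 1, 2, 3; atom 3 bonded to atom 4.
molecule = {0: [1, 2, 3], 1: [0], 2: [0], 3: [0, 4], 4: [3]}

print(morgan_indices(molecule, 1))
print(morgan_indices(molecule, 2))
```

After one iteration each value equals the atom's degree. Further iterations fold in information from progressively larger neighbourhoods, which makes the resulting values usable as refined atom labels.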
50 ?????
- MUTAG ??????
- ???????????????
- ?????????????????????
- 125?????63???????
- ??1????????????????????????????????
- ??????
- SVM???????GIST (http://microarray.cpmc.columbia.edu/gist)
- ???
- ?? C ???
51 ???????? ????
Marginalized ???? ?????
???
52 ???????? ????
53 ??
- ?????????????????????
- Marginalized???????????
- ????????
?????
- ????????????????
- ???????
- ?????????(??????)
- ???????
54 ????
- SVM?????????
- N. Cristianini, J. Shawe-Taylor: An Introduction to Support Vector Machines and Other Kernel-based Learning Methods, Cambridge Univ. Press, 2000.
- ????????????????????
- Kernel Methods in Computational Biology, MIT Press, 2004.
- Marginalized Kernel, Morgan Index
- P. Mahé, N. Ueda, T. Akutsu, J-L. Perret, J-P. Vert: Extensions of marginalized graph kernels, Proc. 21st Int. Conf. Machine Learning, 552-559, 2004.
- LA kernel
- H. Saigo, J-P. Vert, N. Ueda, T. Akutsu: Protein homology detection using string alignment kernels, Bioinformatics, 20:1682-1689, 2004.