Title: ????????LSA ????? ???????????????
1????????LSA ????????????????????
- ??????????????
- ????????
- ?? ??
2?????
- ???????
- ???????? LSA ????
- LSA????????????
- ????
- ????
- ?????
3???????????
- ??????????(???????????????)????????
- SourceForge (http://sourceforge.net/)
- Corporate Source
- ???????????????????
- SourceForge??55000?????????????(2003/03??)
Jamie Dinkelacker and Pankaj K. Garg. Corporate Source: Applying Open Source Concepts to a Corporate Environment (Position Paper). 1st ICSE International Workshop on Open Source Software Engineering, May 15, 2001, Toronto, Canada.
4??
- ????????????????????????
- ?????????????????
- ??????????????????????????
- ????????
- ?????????????????????
- ?????????, ???????
- ????, ?????????, ???????????????????
5???
- ??????????
- ???????????????????
- ???????????????????????????????????
- ?????????????
- ???????????????????
- ??????????????????????????????????????
- ????????????
6??????
??????1
??????3
????
???
GUI (MFC)
GUI (MFC)
???????? (regexp)
???????? (regexp)
??????2
??????4
????
???
GUI (GTK)
GUI (GTK)
???????? (regexp)
7??
- ???????????
- ???????????????????????????
- ????????????????????????????????
- ?????????????????
- ???????????
8?????
- ???????
- ???????? LSA ????
- LSA????????????
- ????
- ????
- ?????
9???????? LSA
- Latent Semantic Analysis
- ???????????????????????
- ??????????????????
- ??????????????????????????????????
Landauer, T. K., Foltz, P. W., Laham, D.
(1998). Introduction to Latent Semantic
Analysis. Discourse Processes, 25, 259-284.
10LSA ??
??1: A B B F          ??4: G G
??2: A B C D E        ??5: F G H H
??3: B C C C D        ??6: E G H
??????? ??
   A B C D E F G H
1  1 2 0 0 0 1 0 0
2  1 1 1 1 1 0 0 0
3  0 1 3 1 0 0 0 0
4  0 0 0 0 0 0 2 0
5  0 0 0 0 0 1 1 2
6  0 0 0 0 1 0 1 1
LSA
      A     B     C     D     E     F     G     H
1   0.3   0.7   0.9   0.4   0.3   0.2   0.3   0.3
2   0.4   1.0   1.4   0.6   0.3   0.2   0.1   0.1
3   0.6   1.5   2.3   1.0   0.4   0.2  -0.2  -0.2
4   0.1   0.1  -0.2   0.0   0.2   0.4   0.9   0.9
5   0.1   0.2  -0.2   0.0   0.4   0.6   1.5   1.4
6   0.1   0.2  -0.1   0.0   0.3   0.4   1.0   0.9
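The second matrix in this example looks consistent with a rank-2 truncated SVD of the count matrix. That is an inference from the numbers, not something the recoverable slide text states. A minimal pure-Python sketch under that assumption follows; the function names and the power-iteration approach are illustrative choices, not the author's implementation.

```python
import math

# Count matrix from the example: rows are documents 1-6, columns are words A-H.
COUNTS = [
    [1, 2, 0, 0, 0, 1, 0, 0],
    [1, 1, 1, 1, 1, 0, 0, 0],
    [0, 1, 3, 1, 0, 0, 0, 0],
    [0, 0, 0, 0, 0, 0, 2, 0],
    [0, 0, 0, 0, 0, 1, 1, 2],
    [0, 0, 0, 0, 1, 0, 1, 1],
]


def top_right_singular_vectors(a, k, iterations=500):
    """Approximate the top-k right singular vectors of a by power iteration."""
    n_cols = len(a[0])
    # Gram matrix A^T A; its leading eigenvectors are the right singular vectors.
    gram = [[sum(row[i] * row[j] for row in a) for j in range(n_cols)]
            for i in range(n_cols)]
    basis = []
    for index in range(k):
        # Deterministic start vector (an arbitrary choice for this sketch).
        v = [1.0 / (i + 1 + index) for i in range(n_cols)]
        for _ in range(iterations):
            v = [sum(g * x for g, x in zip(gram_row, v)) for gram_row in gram]
            # Gram-Schmidt: remove components along vectors already found.
            for b in basis:
                dot = sum(x * y for x, y in zip(v, b))
                v = [x - dot * y for x, y in zip(v, b)]
            norm = math.sqrt(sum(x * x for x in v))
            v = [x / norm for x in v]
        basis.append(v)
    return basis


def low_rank_approximation(a, k):
    """Project each row of a onto the span of the top-k right singular vectors."""
    basis = top_right_singular_vectors(a, k)
    result = []
    for row in a:
        approx = [0.0] * len(row)
        for b in basis:
            coeff = sum(x * y for x, y in zip(row, b))
            approx = [p + coeff * y for p, y in zip(approx, b)]
        result.append(approx)
    return result


REDUCED = low_rank_approximation(COUNTS, 2)
for row in REDUCED:
    print(" ".join(f"{value:5.1f}" for value in row))
```

If the rank-2 assumption holds, the printed rows should be close to the second matrix of the example; a library SVD routine would normally replace the hand-written power iteration.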
11LSA ???
- ????????????????????
- ???????????????
?????????
LSA???
      1     2     3     4     5     6
1   1.0   0.2  -0.1  -0.3  -0.3  -0.5
2   0.2   1.0   0.5  -0.5  -0.9  -0.5
3  -0.1   0.5   1.0  -0.2  -0.4  -0.5
4  -0.3  -0.5  -0.2   1.0   0.3   0.5
5  -0.3  -0.9  -0.4   0.3   1.0   0.5
6  -0.5  -0.5  -0.5   0.5   0.5   1.0
LSA???
      1     2     3     4     5     6
1   1.0   1.0   0.9  -0.6  -0.6  -0.5
2   1.0   1.0   1.0  -0.8  -0.8  -0.7
3   0.9   1.0   1.0  -0.8  -0.8  -0.8
4  -0.6  -0.8  -0.8   1.0   1.0   1.0
5  -0.6  -0.8  -0.8   1.0   1.0   1.0
6  -0.5  -0.7  -0.8   1.0   1.0   1.0
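The entries of the first similarity matrix, where spot-checked, match the Pearson correlation between rows of the 6 x 8 count matrix from the LSA example, rounded to one decimal. The recoverable text does not name the measure, so treating it as Pearson correlation is an assumption. A small sketch of that computation:

```python
import math

# Count matrix from the LSA example: rows are documents 1-6, columns are words A-H.
COUNTS = [
    [1, 2, 0, 0, 0, 1, 0, 0],
    [1, 1, 1, 1, 1, 0, 0, 0],
    [0, 1, 3, 1, 0, 0, 0, 0],
    [0, 0, 0, 0, 0, 0, 2, 0],
    [0, 0, 0, 0, 0, 1, 1, 2],
    [0, 0, 0, 0, 1, 0, 1, 1],
]


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length vectors."""
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    dx = [value - mean_x for value in x]
    dy = [value - mean_y for value in y]
    numerator = sum(a * b for a, b in zip(dx, dy))
    denominator = math.sqrt(sum(a * a for a in dx) * sum(b * b for b in dy))
    return numerator / denominator


# Pairwise similarity between all documents.
SIMILARITY = [[pearson(a, b) for b in COUNTS] for a in COUNTS]
for row in SIMILARITY:
    print(" ".join(f"{value:5.1f}" for value in row))
```

Applying the same function to the rows of the LSA-reduced matrix would give the second similarity matrix under the same assumption.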
12?????
- ???????
- ???????? LSA ????
- LSA????????????
- ????
- ????
- ?????
13LSA??????????????
- ?????????? LSA ???
- ?? ? ??????
- ?? ? ???(??????????)
- LSA??????????????????????
- ??????????????????????
- ??????????????????????????
14????????????
- ???????????????????????????
- ??????????????????????
??????1
??????3
????
???
GUI (MFC)
GUI (MFC)
???????? (regexp)
???????? (regexp)
??????2
??????4
????
???
GUI (GTK)
GUI (GTK)
???????? (regexp)
15?????
- ???????
- ???????? LSA ????
- LSA????????????
- ????
- ????
- ?????
16????????
- ???????????????????????
- window ????????????????????????? GUI ????????????
- ?????????????????????????????????????????????
??????1
??????3
????
???
GUI (MFC)
GUI (MFC)
window
menuBar
cmdButton
window
MFC
17????(1/2)
1.???? ??
Soft1: A B B F J      Soft4: G G I J
Soft2: A B C D E      Soft5: F G H H J
Soft3: B C C C D      Soft6: E G H J
2.???????
   A B C D E F G H I J
1  1 2 0 0 0 1 0 0 0 1
2  1 1 1 1 1 0 0 0 0 0
3  0 1 3 1 0 0 0 0 0 0
4  0 0 0 0 0 0 2 0 1 1
5  0 0 0 0 0 1 1 2 0 1
6  0 0 0 0 1 0 1 1 0 1
3.??????, ??????? ??
   A B C D E F G H
1  1 2 0 0 0 1 0 0
2  1 1 1 1 1 0 0 0
3  0 1 3 1 0 0 0 0
4  0 0 0 0 0 0 2 0
5  0 0 0 0 0 1 1 2
6  0 0 0 0 1 0 1 1
18????(2/2)
   A B C D E F G H
1  1 2 0 0 0 1 0 0
2  1 1 1 1 1 0 0 0
3  0 1 3 1 0 0 0 0
4  0 0 0 0 0 0 2 0
5  0 0 0 0 0 1 1 2
6  0 0 0 0 1 0 1 1
4.LSA
      A     B     C     D     E     F     G     H
1   0.3   0.7   0.9   0.4   0.3   0.2   0.3   0.3
2   0.4   1.0   1.4   0.6   0.3   0.2   0.1   0.1
3   0.6   1.5   2.3   1.0   0.4   0.2  -0.2  -0.2
4   0.1   0.1  -0.2   0.0   0.2   0.4   0.9   0.9
5   0.1   0.2  -0.2   0.0   0.4   0.6   1.5   1.4
6   0.1   0.2  -0.1   0.0   0.3   0.4   1.0   0.9
5.???????????? ??????
A B C D
F G H
6.?????? ????? ??
A B C D: 1 2 3
F G H: 4 5 6 1
7.???? ????? ??
ClusterName1: 1 2 3
ClusterName2: 4 5 6 1
19 1.??????
- ????
- ??????????????????
- ???????????????????????????
1.???? ??
Soft1: A B B F J      Soft4: G G I J
Soft2: A B C D E      Soft5: F G H H J
Soft3: B C C C D      Soft6: E G H J
20 2.???????
- ????
- ??????????????????????????????????????????????
Soft1: A B B F J      Soft4: G G I J
Soft2: A B C D E      Soft5: F G H H J
Soft3: B C C C D      Soft6: E G H J
2.????? ??
   A B C D E F G H I J
1  1 2 0 0 0 1 0 0 0 1
2  1 1 1 1 1 0 0 0 0 0
3  0 1 3 1 0 0 0 0 0 0
4  0 0 0 0 0 0 2 0 1 1
5  0 0 0 0 0 1 1 2 0 1
6  0 0 0 0 1 0 1 1 0 1
21 3.????????????????
- ??????
- ?????????????????????
- ??????
- ???????????????????
- ?????????????????????
   A B C D E F G H I J
1  1 2 0 0 0 1 0 0 0 1
2  1 1 1 1 1 0 0 0 0 0
3  0 1 3 1 0 0 0 0 0 0
4  0 0 0 0 0 0 2 0 1 1
5  0 0 0 0 0 1 1 2 0 1
6  0 0 0 0 1 0 1 1 0 1
3.??????, ??????? ??
   A B C D E F G H
1  1 2 0 0 0 1 0 0
2  1 1 1 1 1 0 0 0
3  0 1 3 1 0 0 0 0
4  0 0 0 0 0 0 2 0
5  0 0 0 0 0 1 1 2
6  0 0 0 0 1 0 1 1
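Steps 1 to 3 of the example can be sketched as follows. The per-software identifier lists are read off the figure. The filtering rule is an assumption: dropping identifiers that occur in only one software system (I) or in more than half of them (J) reproduces the 8-column matrix, but the recoverable text does not give the actual thresholds. `build_matrix`, `filter_identifiers`, `min_systems`, and `max_fraction` are hypothetical names.

```python
from collections import Counter

# Identifier occurrences per software system, read off the example figure.
SOFTWARE = {
    "Soft1": ["A", "B", "B", "F", "J"],
    "Soft2": ["A", "B", "C", "D", "E"],
    "Soft3": ["B", "C", "C", "C", "D"],
    "Soft4": ["G", "G", "I", "J"],
    "Soft5": ["F", "G", "H", "H", "J"],
    "Soft6": ["E", "G", "H", "J"],
}


def build_matrix(software):
    """Return (identifiers, rows): one row of occurrence counts per software system."""
    identifiers = sorted({name for names in software.values() for name in names})
    rows = []
    for names in software.values():
        counts = Counter(names)
        rows.append([counts[name] for name in identifiers])
    return identifiers, rows


def filter_identifiers(identifiers, rows, min_systems=2, max_fraction=0.5):
    """Keep identifiers used by at least min_systems and at most max_fraction of systems."""
    keep = []
    for col in range(len(identifiers)):
        systems = sum(1 for row in rows if row[col] > 0)
        if min_systems <= systems <= max_fraction * len(rows):
            keep.append(col)
    kept_names = [identifiers[col] for col in keep]
    kept_rows = [[row[col] for col in keep] for row in rows]
    return kept_names, kept_rows


ALL_NAMES, ALL_ROWS = build_matrix(SOFTWARE)
NAMES, ROWS = filter_identifiers(ALL_NAMES, ALL_ROWS)
print(NAMES)
for row in ROWS:
    print(row)
```

With these assumed thresholds the result has columns A to H and the same counts as the 8-column matrix of the example.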
22 4.LSA
- ????????????????????????? LSA ???
- LSA ?????????????????????????????????????????
   A B C D E F G H
1  1 2 0 0 0 1 0 0
2  1 1 1 1 1 0 0 0
3  0 1 3 1 0 0 0 0
4  0 0 0 0 0 0 2 0
5  0 0 0 0 0 1 1 2
6  0 0 0 0 1 0 1 1
4.LSA
      A     B     C     D     E     F     G     H
1   0.3   0.7   0.9   0.4   0.3   0.2   0.3   0.3
2   0.4   1.0   1.4   0.6   0.3   0.2   0.1   0.1
3   0.6   1.5   2.3   1.0   0.4   0.2  -0.2  -0.2
4   0.1   0.1  -0.2   0.0   0.2   0.4   0.9   0.9
5   0.1   0.2  -0.2   0.0   0.4   0.6   1.5   1.4
6   0.1   0.2  -0.1   0.0   0.3   0.4   1.0   0.9
23 5.???????????
- LSA ????????????????
- ??????????????
- ??????????????????????
      A     B     C     D     E     F     G     H
1   0.3   0.7   0.9   0.4   0.3   0.2   0.3   0.3
2   0.4   1.0   1.4   0.6   0.3   0.2   0.1   0.1
3   0.6   1.5   2.3   1.0   0.4   0.2  -0.2  -0.2
4   0.1   0.1  -0.2   0.0   0.2   0.4   0.9   0.9
5   0.1   0.2  -0.2   0.0   0.4   0.6   1.5   1.4
6   0.1   0.2  -0.1   0.0   0.3   0.4   1.0   0.9
5.???? ???????
A B C D
F G H
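The grouping in the figure (A B C D and F G H, with E left out) can be reproduced on the toy data with a simple threshold clustering of the identifier columns. This is a swapped-in technique for illustration: the recoverable text names neither the clustering algorithm nor the similarity measure, so cosine similarity, single-link merging, the hand-picked threshold of 0.915, and the dropping of single-member clusters are all assumptions.

```python
import math

NAMES = ["A", "B", "C", "D", "E", "F", "G", "H"]

# LSA-reduced matrix from the example: rows are software 1-6, columns are A-H.
REDUCED = [
    [0.3, 0.7, 0.9, 0.4, 0.3, 0.2, 0.3, 0.3],
    [0.4, 1.0, 1.4, 0.6, 0.3, 0.2, 0.1, 0.1],
    [0.6, 1.5, 2.3, 1.0, 0.4, 0.2, -0.2, -0.2],
    [0.1, 0.1, -0.2, 0.0, 0.2, 0.4, 0.9, 0.9],
    [0.1, 0.2, -0.2, 0.0, 0.4, 0.6, 1.5, 1.4],
    [0.1, 0.2, -0.1, 0.0, 0.3, 0.4, 1.0, 0.9],
]


def cosine(x, y):
    """Cosine similarity between two vectors."""
    dot = sum(a * b for a, b in zip(x, y))
    return dot / math.sqrt(sum(a * a for a in x) * sum(b * b for b in y))


def threshold_clusters(names, columns, threshold):
    """Single-link clustering: join identifiers whose similarity reaches threshold."""
    parent = list(range(len(names)))

    def find(i):
        while parent[i] != i:
            i = parent[i]
        return i

    for i in range(len(names)):
        for j in range(i + 1, len(names)):
            if cosine(columns[i], columns[j]) >= threshold:
                parent[find(j)] = find(i)
    groups = {}
    for i, name in enumerate(names):
        groups.setdefault(find(i), []).append(name)
    # Drop single-member clusters (assumed; E ends up alone in this example).
    return [members for members in groups.values() if len(members) > 1]


# One vector per identifier: the corresponding column of the reduced matrix.
COLUMNS = [[row[col] for row in REDUCED] for col in range(len(NAMES))]
CLUSTERS = threshold_clusters(NAMES, COLUMNS, threshold=0.915)
print(CLUSTERS)
```

The threshold was chosen only so that the toy example yields the two groups shown; real data would need a principled cut-off or a standard hierarchical clustering routine.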
24 6.?????????????
- ?????????????????????????????
- ?????????????????????????????
Soft1: A B B F J      Soft4: G G I J
Soft2: A B C D E      Soft5: F G H H J
Soft3: B C C C D      Soft6: E G H J
6.??????????? ??
A B C D: 1 2 3
F G H: 4 5 6 1
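In the figure, the identifier cluster A B C D maps to software 1, 2, 3 and the cluster F G H maps to software 4, 5, 6, 1. One rule that reproduces this is "a software system belongs to a cluster if it uses at least one identifier of that cluster". The rule is inferred from the figure, not taken from the text, and `software_for_cluster` is a hypothetical name.

```python
# Identifier occurrences per software system, read off the example figure.
SOFTWARE = {
    "Soft1": ["A", "B", "B", "F", "J"],
    "Soft2": ["A", "B", "C", "D", "E"],
    "Soft3": ["B", "C", "C", "C", "D"],
    "Soft4": ["G", "G", "I", "J"],
    "Soft5": ["F", "G", "H", "H", "J"],
    "Soft6": ["E", "G", "H", "J"],
}

# Identifier clusters as shown in the figure.
IDENTIFIER_CLUSTERS = [["A", "B", "C", "D"], ["F", "G", "H"]]


def software_for_cluster(cluster, software):
    """Return the software systems that use at least one identifier of the cluster."""
    members = set(cluster)
    return [name for name, identifiers in software.items()
            if members & set(identifiers)]


for cluster in IDENTIFIER_CLUSTERS:
    print(cluster, "->", software_for_cluster(cluster, SOFTWARE))
```

Under this rule a software system can appear in more than one cluster, as Soft1 does here.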
25 7.???????????
- ????????????????????
- LSA ???????????????????????
- ????????????????????????
- ???????????????????????????????????????????
7.???????????
ClusterName1: 1 2 3
ClusterName2: 4 5 6 1
26????????
- ?? C??????????????
- ?????????????????????
- ???? Perl
- ??????????? C ???
- LSA ?????? SVDPACKC ???
- ????4000?
27?????
- ???????
- ???????? LSA ????
- LSA????????????
- ????
- ????
- ?????
28??
- ???????????????????????
- ????
- SourceForge ?????? 6 ?????????
- boardgames, compilers, database, editor, videoconversion, xterm
- ????? C ????????????????????????
- ??? 41 ????????
- ?? 164102????????
- ????????????????????????????? 22048 ????
29??????????(??)
???? | ?????? | ?????
AOP, emitcode, IC_RESULT, IC_LEFT, aop, aopGet, IC_RIGHT, pic14_emitcode, iCode, etype | compilers/gbdk, compilers/sdcc | 8597
CASE_IGNORE, CASE_GROUND_STATE, screen, CASE_PRINT, CASE_BYP_STATE, Widget, TScreen, CASE_IGNORE_STATE, CASE_PLT_VEC, CASE_PT_POINT | xterm/R6.3, xterm/R6.4 | 2160
YY_BREAK, yyvsp, yyval, DATA, yy_current_buffer, tuple, yy_current_state, yy_c_buf_p, yy_cp, uint32 | compilers/gbdk, database/mysql-3.23.49, database/postgresql-7.2.1 | 223
AVI, cinfo, OUTLONG, avi_t, AVI_errno, hdrl_data, OUT4CC, nhb, ERR_EXIT, str2ulong | videoconversion/dv2jpg-1.1, videoconversion/libcu30-1.0, videoconversion/mjpgTools | 177
board, num_moves, ply, pawn_file, npiece, pawns, moves, white_to_move, move_s, promoted | boardgame/Sjeng-10.0, boardgame/cinag-1.1.4, boardgame/faile_1_4_4 | 154
GtkWidget, gchar, gpointer, gint, widget, gtk_widget_show, N_, g_free, dialog, g_return_if_fail | boardgame/gbatnav-1.0.4, editor/gedit-1.120.0, editor/gmas-1.1.0, editor/gnotepad-1.3.3, editor/peacock-0.4 | 104
30????
- ??40??????
- ???????????????
- GTK(2????) GUI ?????
- yacc(2????) ?????????
- regexp ???????????
- getopt ??????????
- JNI Java ??????????????????????
- Python/C Python ???????????????
??????? 8
????????????? 18
31??
- ????????????????????????????
- ?????????????????????
- ??????????????
- ????????
- ????????????????????????
- ?????????????????????????????????????????
32?????????
- ????????????????????
- ??????????????????????????????????????????????????
???????? - ?????
- ????????????????????
- ??????
33?
34??
- ?????????????????
- ????????????????
- ?????????????????
- ?????????????
- ??????
- ??????
- ????????
- ?????????????????????????
- ???????????????????????????????
35????
- 40??????
- ??????? 20 ??????
???????????? 18
?????????????????? (GTK(2????), yacc(2????), regexp, JNI, getopt, Python/C) 8
??? 14
???????????? 14
?????????????????? (GTK(2????), yacc) 3
??? 3
36??????
- Precision
- ??????????????????????????????
- Recall
- ????????????????????????????????
- ??????????????????????????
            ?????   ??20????
Precision   0.65    0.85
Recall      0.16    0.13
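The precision figures are consistent with the cluster counts on the classification slide if the first two categories are counted as correct and the third is not: (18 + 8) / 40 = 0.65 for all clusters and (14 + 3) / 20 = 0.85 for the top 20. That reading is an assumption, since the category labels are not recoverable; the dictionary keys below are placeholders. Recall cannot be recomputed here because its denominator does not survive in the text.

```python
# Cluster counts per category; keys are placeholder names for the garbled labels.
ALL_CLUSTERS = {"category_1": 18, "category_2": 8, "other": 14}
TOP_20_CLUSTERS = {"category_1": 14, "category_2": 3, "other": 3}


def precision(counts, correct_keys=("category_1", "category_2")):
    """Fraction of clusters that fall into the categories counted as correct."""
    correct = sum(counts[key] for key in correct_keys)
    return correct / sum(counts.values())


print(f"all clusters: {precision(ALL_CLUSTERS):.2f}")
print(f"top 20:       {precision(TOP_20_CLUSTERS):.2f}")
```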
37????
- ???????????????, ????????????????????????????
- ????????????????
- ????, ??
- ????????????????, ???????????
- ?????
- ??????????(??????)
- ?????????
N. Anquetil and T. Lethbridge. Extracting concepts from file names: a new file clustering criterion. In Proc. 20th Intl. Conf. Software Engineering, May 1998.
G. A. Di Lucca, A. R. Fasolino, F. Pace, P. Tramontana, U. De Carlini. Comprehending Web Applications by a Clustering Based Approach. 10th International Workshop on Program Comprehension (IWPC'02).
Jonathan I. Maletic and Andrian Marcus. Supporting Program Comprehension Using Semantic and Structural Information. In Proceedings of the 23rd IEEE International Conference on Software Engineering (ICSE 2001).
38?????????
- ????????????
- ????????????????????????????????????
- ????????????????????
- ?????????????????, ??????????????
- ????????????????????????
- ?????????????????????
39LSA (Latent Semantic Analysis)
- ???????
- ?????????????????????????
- ??????????????????????
- ???????????????
Landauer, T. K., Foltz, P. W., Laham, D.
(1998). Introduction to Latent Semantic
Analysis. Discourse Processes, 25, 259-284.