Title: ??????????GCR????
1??????????GCR????
- ???????????? ?? ?
- ???????????? ?? ??
- ???????????????????
- ?? ??
- ???????????? ?? ??
2??????????
?????????????????????
3GCR????
- CG???????????????????Eisenstat, 83
- Arnoldi??????????????????? (GMRES???????????)
GCR????
- ?????????????
- ??????????????
- GMRESR?Vorst,91????????????
4GCR?? ???
- ???????(O(k^2 N)????3?)
- ??????????(GMRES??2?)
????! ???????!
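As a reference point for the O(k^2 N) cost mentioned above, here is a minimal sketch of textbook restarted GCR(k) in pure Python. It is not the eGCR, meGCR, or uGCR variant presented in these slides; the helper names (`dot`, `matvec`, `gcr`), the dense list-of-rows matrix, and the stopping rule are illustrative assumptions. The inner orthogonalization loop runs j times at step j, which is where the quadratic dependence on the restart length k comes from.

```python
def dot(x, y):
    """Inner product of two equal-length lists."""
    return sum(a * b for a, b in zip(x, y))


def matvec(A, x):
    """Dense matrix-vector product; A is a list of rows."""
    return [dot(row, x) for row in A]


def gcr(A, b, k=10, tol=1e-10, max_restarts=100):
    """Textbook restarted GCR(k) for A x = b (illustrative sketch only)."""
    n = len(b)
    x = [0.0] * n
    r = list(b)                      # residual for the zero initial guess
    bnorm = dot(b, b) ** 0.5 or 1.0  # avoid dividing by zero when b = 0
    for _ in range(max_restarts):
        P, AP = [], []               # directions p_i and their images A p_i
        for _ in range(k):
            if dot(r, r) ** 0.5 <= tol * bnorm:
                return x
            p = list(r)
            ap = matvec(A, p)
            # Orthogonalize A p against every stored A p_i (modified
            # Gram-Schmidt). Step j performs j passes of length-n work.
            for pi, api in zip(P, AP):
                beta = dot(ap, api)  # stored api vectors have unit norm
                p = [u - beta * v for u, v in zip(p, pi)]
                ap = [u - beta * v for u, v in zip(ap, api)]
            nrm = dot(ap, ap) ** 0.5
            if nrm == 0.0:           # breakdown: no usable new direction
                return x
            p = [u / nrm for u in p]
            ap = [u / nrm for u in ap]
            alpha = dot(r, ap)       # minimizes the residual along ap
            x = [xi + alpha * v for xi, v in zip(x, p)]
            r = [ri - alpha * v for ri, v in zip(r, ap)]
            P.append(p)
            AP.append(ap)
    return x
```

For example, `gcr([[4.0, 1.0], [2.0, 3.0]], [1.0, 2.0], k=5)` solves a small nonsymmetric system whose exact solution is x = (0.1, 0.6). Storing both `P` and `AP` is also what makes the memory footprint grow with 2k vectors.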
5?????
- ???????efficient GCR(eGCR)???????Yang,95
??????????????????? ??????????????!!
6????2????????
- Memory efficient GCR(meGCR)?
- ????eGCR??????
- ???????eGCR???????????????
- Unrolled GCR(uGCR)?
- ?????????????eGCR??????
7Efficient GCR?
8Efficient GCR?
??????p?????????? ???p?Ap????????????
9Efficient GCR?
10Memory efficient GCR?
??????????Ap?????
????eGCR?????
????u??????
11Unrolled GCR?
?????????????? ???Air0????????????
Air0?dominant???????? ??????????? ???????????
????(BLAS3)??????? ???
????????????????
??????? meGCR???
12??????
???n? ???-?????
1???????????
????    dmv     dp   daxpy  smv  prec  bin  kmv  Dmm
???     2kn     2n   2n
GCR     3(k-1)  2k   2k-1   k    k     0    0    0
eGCR    2k-1    2k   k      k    k     1    1    0
meGCR   2k-1    2k   k      k    k+1   1    2    0
uGCR    0       0    0      k    k+1   1    4k   1
? ????????? ????????????
k ???????(????) n ????? (?????)
13?????????1(????????????)
        Vectors of length n   Buffers of size k^2
GCR     2k+3                  0
eGCR    2k+3                  2
meGCR   k+3                   2
uGCR    k+2                   5
14?????????2(???????????????????????)
method ?????????
GCR
eGCR
meGCR
uGCR
k ???????(????) n ????? (?????) z
??????????????
15????
- ??? HITACHI SR2201
- (????????????)
- CPU 300MFlops 1024PE
- Main memory 256MB/PE
- Communication 300MB/s
- ???????MPI (Message Passing Interface)
16Problems
- Problem 1
- Toeplitz??
- Problem 2
- ???????????????(2??)
- Problem 3
- ???????????????(3??)
17meGCR??????(??)
????(?) ???????32
                          Problem 1   Problem 2   Problem 3
n                         400,000     160,000     64,000
Without B-ILU(0)  GCR     22.8        4860        37.8
                  eGCR    18.3        3440        27.7
                  meGCR   17.9        3450        28.7
With B-ILU(0)     GCR     21.2        938         21.9
                  eGCR    19.8        812         19.9
                  meGCR   20.1        825         20.1
18meGCR??????(????????)
Problem 1 (n = 4,000,000)
Problem 2 (n = 160,000)
Problem 3 (n = 512,000)
???????????32
19meGCR??????(???B-ILU(0)???)
Problem 1 (n = 4,000,000)
Problem 2 (n = 160,000)
Problem 3 (n = 512,000)
???????????32
20uGCR??????
??????? 8
                                 Without B-ILU(0)     With B-ILU(0)
                                 Iteration  Time      Iteration  Time
Problem 1 (n = 400,000)  GCR     46         18.5      17         20.3
                         eGCR    46         15.4      17         18.2
                         meGCR   46         15.4      17         19.5
                         uGCR    55         13.5      25         26.7
Problem 3 (n = 64,000)   GCR     1096       64.8      150        30.6
                         eGCR    1096       53.6      150        29.6
                         meGCR   1096       55.7      150        31.7
                         uGCR    1053       43.0      150        30.1
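The timings in the table above are easier to compare as ratios against plain GCR. The short sketch below does only that arithmetic on the numbers copied from the table. The dictionary layout and the function name `speedup_over_gcr` are illustrative, and it assumes that the first Time column is the run without B-ILU(0) and the second is the run with it.

```python
# Times copied from the table above, keyed by (problem, method).
# Each value is (first Time column, second Time column); the second
# column is the one labelled B-ILU(0) in the table header.
times = {
    ("Problem 1", "GCR"): (18.5, 20.3),
    ("Problem 1", "eGCR"): (15.4, 18.2),
    ("Problem 1", "meGCR"): (15.4, 19.5),
    ("Problem 1", "uGCR"): (13.5, 26.7),
    ("Problem 3", "GCR"): (64.8, 30.6),
    ("Problem 3", "eGCR"): (53.6, 29.6),
    ("Problem 3", "meGCR"): (55.7, 31.7),
    ("Problem 3", "uGCR"): (43.0, 30.1),
}


def speedup_over_gcr(problem, method):
    """Return (ratio for column 1, ratio for column 2) = GCR time / method time."""
    base = times[(problem, "GCR")]
    t = times[(problem, method)]
    return tuple(b / m for b, m in zip(base, t))


for (problem, method) in times:
    if method != "GCR":
        s1, s2 = speedup_over_gcr(problem, method)
        print(f"{problem} {method}: {s1:.2f}x, {s2:.2f}x with B-ILU(0)")
```

A ratio above 1 means the variant finished faster than GCR. On these numbers uGCR gives the largest ratio in the first column for both problems, while in the B-ILU(0) column for Problem 1 its ratio drops below 1, in line with its higher iteration count there (25 versus 17).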
21??????
- ???????????
- ???????????
- ???????????
- ??????
- GCR??2?????????????
- Memory efficient GCR?
- ???????????????
- ??????????????????????????????
- Unrolled GCR?
- ??????????????????????
?????????????!
???????????????????