Title: DVM technology for developing parallel programs for computing clusters
1. DVM technology for developing parallel programs for computing clusters
Vladimir Aleksandrovich Bakhtin, Cand. Sc. (Physics and Mathematics), head of sector at the Keldysh Institute of Applied Mathematics, Russian Academy of Sciences; assistant at the Department of System Programming, Faculty of Computational Mathematics and Cybernetics (VMK), Lomonosov Moscow State University
bakhtin_at_keldysh.ru
2. Contents
- Past
  - MPI: the message-passing model
  - DVM: the data- and control-parallel model. The C-DVM language
- Present
  - Multicore and multithreaded processors. SMP clusters
  - The hybrid MPI/OpenMP programming model
  - The Fortran-DVM/OpenMP language
- Future
  - Hybrid computing clusters
  - PGI Accelerator Model
  - The Fortran-DVM/OpenMP/Accelerator language
3. Jacobi algorithm. Sequential version
    /* Jacobi program */
    #include <stdio.h>
    #define L 1000
    #define ITMAX 100
    int i,j,it;
    double A[L][L];
    double B[L][L];
    int main(int an, char **as)
    {
        printf("JAC STARTED\n");
        for(i=0;i<=L-1;i++)
            for(j=0;j<=L-1;j++)
            {
                A[i][j]=0.;
                B[i][j]=1.+i+j;
            }
4. Jacobi algorithm. Sequential version
        /* iteration loop */
        for(it=1; it<=ITMAX;it++)
        {
            for(i=1;i<=L-2;i++)
                for(j=1;j<=L-2;j++)
                    A[i][j] = B[i][j];
            for(i=1;i<=L-2;i++)
                for(j=1;j<=L-2;j++)
                    B[i][j] = (A[i-1][j]+A[i+1][j]+A[i][j-1]+A[i][j+1])/4.;
        }
        return 0;
    }
5. Jacobi algorithm. MPI version
6. Jacobi algorithm. MPI version
    /* Jacobi-1d program */
    #include <math.h>
    #include <stdlib.h>
    #include <stdio.h>
    #include "mpi.h"
    #define m_printf if (myrank==0)printf
    #define L 1000
    #define ITMAX 100
    int i,j,it,k;
    int ll,shift;
    double (* A)[L];
    double (* B)[L];
7. Jacobi algorithm. MPI version
    int main(int argc, char **argv)
    {
        MPI_Request req[4];
        int myrank, ranksize;
        int startrow,lastrow,nrow;
        MPI_Status status[4];
        double t1, t2, time;
        MPI_Init (&argc, &argv); /* initialize MPI system */
        MPI_Comm_rank(MPI_COMM_WORLD, &myrank); /* my place in MPI system */
        MPI_Comm_size (MPI_COMM_WORLD, &ranksize); /* size of MPI system */
        MPI_Barrier(MPI_COMM_WORLD);
        /* rows of matrix I have to process */
        startrow = (myrank * L) / ranksize;
        lastrow = (((myrank + 1) * L) / ranksize)-1;
        nrow = lastrow - startrow + 1;
        m_printf("JAC1 STARTED\n");
8. Jacobi algorithm. MPI version
        /* dynamically allocate data structures */
        A = malloc ((nrow+2) * L * sizeof(double));
        B = malloc ((nrow) * L * sizeof(double));
        for(i=1; i<=nrow; i++)
            for(j=0; j<=L-1; j++)
            {
                A[i][j]=0.;
                B[i-1][j]=1.+startrow+i-1+j;
            }
9. Jacobi algorithm. MPI version
        /* iteration loop */
        t1=MPI_Wtime();
        for(it=1; it<=ITMAX; it++)
        {
            for(i=1; i<=nrow; i++)
            {
                /* skip the global boundary rows */
                if (((i==1)&&(myrank==0))||((i==nrow)&&(myrank==ranksize-1))) continue;
                for(j=1; j<=L-2; j++)
                {
                    A[i][j] = B[i-1][j];
                }
            }
10. Jacobi algorithm. MPI version
            if(myrank!=0)
                MPI_Irecv(&A[0][0],L,MPI_DOUBLE, myrank-1, 1235,
                          MPI_COMM_WORLD, &req[0]);
            if(myrank!=ranksize-1)
                MPI_Isend(&A[nrow][0],L,MPI_DOUBLE, myrank+1, 1235,
                          MPI_COMM_WORLD,&req[2]);
            if(myrank!=ranksize-1)
                MPI_Irecv(&A[nrow+1][0],L,MPI_DOUBLE, myrank+1, 1236,
                          MPI_COMM_WORLD, &req[3]);
            if(myrank!=0)
                MPI_Isend(&A[1][0],L,MPI_DOUBLE, myrank-1, 1236,
                          MPI_COMM_WORLD,&req[1]);
            ll=4; shift=0;
            if (myrank==0) {ll=2;shift=2;}
            if (myrank==ranksize-1) {ll=2;}
            MPI_Waitall(ll,&req[shift],&status[0]);
11. Jacobi algorithm. MPI version
            for(i=1; i<=nrow; i++)
            {
                if (((i==1)&&(myrank==0))||((i==nrow)&&(myrank==ranksize-1))) continue;
                for(j=1; j<=L-2; j++)
                    B[i-1][j] = (A[i-1][j]+A[i+1][j]+A[i][j-1]+A[i][j+1])/4.;
            }
        } /* DO it */
        printf("%d: Time of task=%lf\n",myrank,MPI_Wtime()-t1);
        MPI_Finalize ();
        return 0;
    }
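The row partitioning in the MPI listing (startrow, lastrow, nrow) can be checked in isolation. The following is a minimal sketch, not part of the original program: the names RowRange, row_range and ranges_cover are hypothetical helpers introduced here, and the formulas are copied from the listing.

```c
#include <assert.h>

/* Rows owned by one rank under the block partitioning of the listing:
   rank r of nranks owns rows first..last (inclusive) of an n-row matrix.
   RowRange and both functions are illustrative names, not MPI API. */
typedef struct { int first, last, count; } RowRange;

RowRange row_range(int rank, int nranks, int n)
{
    RowRange r;
    assert(nranks > 0 && rank >= 0 && rank < nranks && n >= 0);
    r.first = (rank * n) / nranks;               /* startrow */
    r.last  = (((rank + 1) * n) / nranks) - 1;   /* lastrow  */
    r.count = r.last - r.first + 1;              /* nrow     */
    return r;
}

/* Returns 1 if the ranges of all ranks tile rows 0..n-1 with no gap
   and no overlap, 0 otherwise. */
int ranges_cover(int nranks, int n)
{
    int next = 0;
    for (int rank = 0; rank < nranks; rank++) {
        RowRange r = row_range(rank, nranks, n);
        if (r.first != next) return 0;
        next = r.last + 1;
    }
    return next == n;
}
```

Because the upper bound of one rank is the lower bound of the next, the blocks tile the matrix even when n is not divisible by the number of ranks; block sizes then differ by at most one row.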
12. Jacobi algorithm. DVM version
    #include <stdio.h>
    #define L 1000
    #define ITMAX 100
    int i,j,it;
    #define DVM(dvmdir)
    #define DO(v,l,h,s) for(v=(l); v<=(h); v+=(s))
    DVM(DISTRIBUTE [BLOCK][BLOCK]) double A[L][L];
    DVM(ALIGN [i][j] WITH A[i][j]) double B[L][L];
    int main(int an, char **as)
    {
        printf("JAC STARTED\n");
        DVM(PARALLEL [i][j] ON A[i][j])
        DO(i,0,L-1,1)
        DO(j,0,L-1,1)
        {
            A[i][j]=0.;
            B[i][j]=1.+i+j;
        }
13. Jacobi algorithm. DVM version
        /* iteration loop */
        for(it=1; it<=ITMAX;it++)
        {
            DVM(PARALLEL [i][j] ON A[i][j])
            DO(i,1,L-2,1)
            DO(j,1,L-2,1)
                A[i][j] = B[i][j];
            DVM(PARALLEL [i][j] ON B[i][j]; SHADOW_RENEW A)
            DO(i,1,L-2,1)
            DO(j,1,L-2,1)
                B[i][j] = (A[i-1][j]+A[i+1][j]+A[i][j-1]+A[i][j+1])/4.;
        }
        return 0;
    }
14. The data- and control-parallel model. DVM
- This model (1993), which underlies the parallel programming languages Fortran-DVM and C-DVM, combines the strengths of the data-parallel model and the control-parallel model.
- The system for developing parallel programs (DVM) built on these languages was created at the Keldysh Institute of Applied Mathematics, Russian Academy of Sciences.
- The abbreviation DVM (Distributed Virtual Memory, Distributed Virtual Machine) reflects its support for virtual shared memory on distributed systems.
15. The DVM programming model
- The programmer is responsible for observing the owner-computes rule.
- The programmer identifies shared (remote) data, that is, data computed on some processors and used on others.
- The programmer marks the points in the sequential program where the values of shared data are updated.
- The programmer distributes over the processors of the virtual parallel machine not only the data but also the corresponding computations.
16. Components of the DVM system
The DVM system consists of the following components:
- Fortran-DVM/OpenMP compiler
- C-DVM compiler
- LIB-DVM support library
- DVM debugger
- DVM program execution predictor
- DVM program performance analyzer
17. Programming tools
- C-DVM = the C language + special macros
- Fortran-DVM/OpenMP = Fortran 95 + special comments
- The special comments and macros are high-level specifications of parallelism expressed in terms of the sequential program.
- There are no low-level data transfers or synchronization.
- The programming style is sequential.
- The parallelism specifications are invisible to standard compilers.
- A single copy of the program serves both sequential and parallel execution.
18. Mapping a sequential program
[Figure: mapping diagram. Elements: loops, arrays, task array, array of virtual processors, physical processors. Directives on the arrows: PARALLEL, ALIGN, DISTRIBUTE, MAP.]
19. Data distribution. DISTRIBUTE
DVM( DISTRIBUTE [f1]...[fk] ) <array declaration in C>
where fi is one of:
- [ BLOCK ]: distribution in equal blocks (distributed dimension)
- [ MULT_BLOCK(m) ]: distribution in blocks that are multiples of m (distributed dimension)
- [ GENBLOCK ( block-array-name ) ]: distribution in blocks of the given sizes
- [ WGTBLOCK ( block-array-name,nblock ) ]: distribution in weighted blocks
- [ ]: the whole dimension is kept together (local dimension)
and k is the number of dimensions of the array.
20. Data distribution. DISTRIBUTE
    DVM(DISTRIBUTE [BLOCK]) double A[12];
    DVM(DISTRIBUTE [BLOCK]) double B[6];
    DVM(DISTRIBUTE [MULT_BLOCK(3)]) double A[12];
         node1    node2    node3    node4
    A    0,1,2    3,4,5    6,7,8    9,10,11
    B    0,1      2,3      4        5
    double wb[6]={1.,0.5,0.5,0.5,0.5,1.};
    int bs[4]={2,4,4,2};
    DVM(DISTRIBUTE [GEN_BLOCK(bs)]) double A[12];
    DVM(DISTRIBUTE [WGT_BLOCK(wb,6)]) double B[6];
         node1    node2      node3      node4
    A    0,1      2,3,4,5    6,7,8,9    10,11
    B    0        1,2        3,4        5
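The two tables above can be reproduced with a few lines of arithmetic. This is a sketch under an assumption: the rule "the first n % p nodes get one extra element" is chosen here because it matches the BLOCK rows shown (A[12] and B[6] on four nodes); the slides do not state the rule DVM actually applies. The function names are hypothetical, and WGT_BLOCK is not covered.

```c
#include <assert.h>

/* Size of the block held by `node` (0-based) when n elements are split
   over p nodes. ASSUMPTION: the first n % p nodes get one extra element;
   this reproduces the BLOCK rows of the table, nothing more is claimed. */
int block_size(int n, int p, int node)
{
    assert(p > 0 && node >= 0 && node < p && n >= 0);
    return n / p + (node < n % p ? 1 : 0);
}

/* Index of the first element held by `node` under the same rule. */
int block_start(int n, int p, int node)
{
    int rem = n % p;
    assert(p > 0 && node >= 0 && node < p && n >= 0);
    return node * (n / p) + (node < rem ? node : rem);
}

/* GEN_BLOCK-style distribution: bs[k] is the block size of node k.
   Returns the node holding element i, or -1 if i is out of range. */
int genblock_owner(const int *bs, int p, int i)
{
    int start = 0;
    if (i < 0) return -1;
    for (int node = 0; node < p; node++) {
        if (i < start + bs[node]) return node;
        start += bs[node];
    }
    return -1;
}
```

With bs = {2,4,4,2}, genblock_owner places elements 0-1 on the first node, 2-5 on the second, 6-9 on the third and 10-11 on the fourth, as in the second table.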
21. Data distribution. DISTRIBUTE
    DVM(DISTRIBUTE [BLOCK] [] [BLOCK]) float A[N][N][N];
    dvm run M1 M2 <program-name> /* P(M1,M2) */
The directive distributes the first dimension of array A over the first dimension of P in blocks of size N/M1 and the third dimension of A over the second dimension of P in blocks of size N/M2; the second dimension of A is placed whole on every virtual processor.
    #define DVM(dvmdir)
22. Distribution of computations outside parallel loops
- A single SPMD program runs on all processors, each with its local subset of the data.
- Owner-computes rule:
  - OWN(A[i]) is the processor onto which A[i] is distributed.
  - A[i] = expr_i
  - The statement is always executed on processor OWN(A[i]).
- Test: is a statement "own" or "foreign" under the owner-computes rule?
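The owner-computes rule can be illustrated without any parallel runtime by simulating the processors in a loop. The sketch below is only an illustration: N, P, own and run_spmd are hypothetical names, the block distribution is an assumption, and a real DVM program would rely on the run-time system for this test.

```c
#include <assert.h>

#define N 10   /* array length (illustrative) */
#define P 4    /* number of simulated processors (illustrative) */

/* OWN(A[i]) for a block distribution in which the first N % P processors
   hold one extra element. ASSUMPTION: rule picked for illustration. */
int own(int i)
{
    int base = N / P, rem = N % P;
    int big = rem * (base + 1);      /* elements held by the larger blocks */
    assert(i >= 0 && i < N);
    if (i < big) return i / (base + 1);
    return rem + (i - big) / base;
}

/* Every simulated processor runs the same loop (SPMD) but executes the
   assignment A[i] = expr(i) only for elements it owns. writes[i] counts
   how often A[i] was assigned; the return value is the total number of
   executed assignments. */
int run_spmd(double A[N], int writes[N])
{
    int executed = 0;
    for (int i = 0; i < N; i++) writes[i] = 0;
    for (int proc = 0; proc < P; proc++) {
        for (int i = 0; i < N; i++) {
            if (own(i) != proc) continue;   /* "foreign" statement: skip */
            A[i] = 2.0 * i + 1.0;           /* "own" statement: execute  */
            writes[i]++;
            executed++;
        }
    }
    return executed;
}
```

Each element is written exactly once, by its owner, so the total number of executed assignments equals N regardless of the number of processors.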
23. Data localization. ALIGN
    DVM(DISTRIBUTE B[BLOCK][BLOCK]) float B[N][M+1];
    . . .
    for (i=0; i<=N-1;i++)
        for(j=0; j<=M-2; j++)
        {
            B[i][j+1] = A[i][j];
        }
    DVM(ALIGN [i][j] WITH B[i][j+1]) float A[N][M];
24. Data localization. ALIGN
    DVM(ALIGN a1...an WITH B b1...bm ) <declaration of array A in C>
where
- ai is the parameter of the i-th dimension of the aligned array A
- bj is the parameter of the j-th dimension of the base array B
- n is the number of dimensions of array A
- m is the number of dimensions of array B
- ai = [IDi], bj = [c*IDj + d]
- IDi, IDj are identifiers
- c, d are integer constants
25. Data localization. ALIGN
- DVM(ALIGN [i] WITH B[2*i+1] ) float A[N];
  Place element A[i] and element B[2*i+1] on the same processor.
- DVM(ALIGN [i][j] WITH B[j][i]) float A[N][N];
  Place element A[i][j] and element B[j][i] on the same processor.
- DVM(ALIGN [i] WITH B[][i]) float A[N];
  Place element A[i] on every processor that holds at least one element of the i-th column of B.
- DVM(ALIGN [i][] WITH B[i]) float A[N][N];
  Place the i-th row of A and element B[i] on the same processor, that is, the second dimension of A is not distributed.
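The first alignment rule above, ALIGN [i] WITH B[c*i + d], amounts to composing the index mapping with the distribution of the base array. A minimal sketch, assuming a block distribution of B and using the hypothetical names base_owner and aligned_owner:

```c
#include <assert.h>

/* Processor holding element j of a base array of length nb that is
   block-distributed over p processors (first nb % p processors hold one
   extra element). ASSUMPTION: distribution rule chosen for illustration. */
int base_owner(int j, int nb, int p)
{
    int base = nb / p, rem = nb % p;
    int big = rem * (base + 1);
    assert(p > 0 && j >= 0 && j < nb);
    if (j < big) return j / (base + 1);
    return rem + (j - big) / base;
}

/* Processor holding A[i] under ALIGN [i] WITH B[c*i + d]:
   A[i] simply follows the base element it is aligned with. */
int aligned_owner(int i, int c, int d, int nb, int p)
{
    return base_owner(c * i + d, nb, p);
}
```

For B of length 16 on 4 processors and the rule B[2*i+1], elements A[0] and A[1] land on processor 0 and A[7] on processor 3, so A is spread over the processors with half the density of B.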
26. Distribution of loop iterations. PARALLEL
- The headers of a parallel loop must not be separated by other statements (a tightly nested loop).
- The parameters of the parallel loop headers must not change while the loop executes (a rectangular index space).
- A loop iteration must be an indivisible unit executed on a single processor. Therefore the left-hand sides of the assignment statements in one iteration must be distributed onto the same processor (consistency with the owner-computes rule).
27. Distribution of loop iterations. PARALLEL
    FOR(i, N)
    {
        D[2*i] = ...;
        D[2*i+1] = ...;
    }
    DVM(PARALLEL [i] ON D[2*i])
    FOR(i, N)
    {
        D[2*i] = ...;
    }
    DVM(PARALLEL [i] ON D[2*i+1])
    FOR(i, N)
    {
        D[2*i+1] = ...;
    }
28. Data localization. TEMPLATE
    DVM(PARALLEL [i] ON A[i])
    FOR(i, N)
    {
        A[i] = B[i+d1] + C[i-d2];
    }
    DVM(DISTRIBUTE [BLOCK]; TEMPLATE [N+d1+d2]) void *TABC;
    DVM( ALIGN [i] WITH TABC[i] ) float B[N];
    DVM( ALIGN [i] WITH TABC[i+d2] ) float A[N];
    DVM( ALIGN [i] WITH TABC[i+d1+d2] ) float C[N];
29. Remote data of type SHADOW
    DVM(DISTRIBUTE [BLOCK]) float A[N];
    DVM(PARALLEL [i] ON A[i]; SHADOW_RENEW B )
    FOR(i, N)
    {
        A[i] = B[i+d1] + B[i-d2];
    }
    DVM( ALIGN [i] WITH A[i]; SHADOW [d1:d2] ) float B[N];
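What a shadow renewal has to accomplish can be shown by simulating the blocks in one process. This is a sketch under stated assumptions, not DVM's implementation: the struct Local, the sizes N, P, D1, D2 and the three functions are hypothetical, and N is assumed divisible by P. The read B[i-D2] needs D2 extra cells on the left of each block and B[i+D1] needs D1 extra cells on the right.

```c
#include <assert.h>

#define N  12          /* global length (illustrative) */
#define P  3           /* simulated processors; N divisible by P */
#define LB (N / P)     /* local block length */
#define D1 2           /* right offset read by the loop: B[i + D1] */
#define D2 1           /* left offset read by the loop:  B[i - D2] */

/* Local storage of one processor: D2 shadow cells, LB own cells,
   D1 shadow cells. */
typedef struct { double cell[D2 + LB + D1]; } Local;

/* Copy the global array into the own cells of every block. */
void scatter(const double *B, Local loc[P])
{
    for (int p = 0; p < P; p++)
        for (int k = 0; k < LB; k++)
            loc[p].cell[D2 + k] = B[p * LB + k];
}

/* Shadow renewal: copy the edge values of neighbouring blocks into the
   shadow cells. */
void shadow_renew(Local loc[P])
{
    for (int p = 0; p < P; p++) {
        if (p > 0)                       /* left shadow from left neighbour */
            for (int k = 0; k < D2; k++)
                loc[p].cell[k] = loc[p - 1].cell[LB + k];
        if (p < P - 1)                   /* right shadow from right neighbour */
            for (int k = 0; k < D1; k++)
                loc[p].cell[D2 + LB + k] = loc[p + 1].cell[D2 + k];
    }
}

/* Each processor computes A[i] = B[i+D1] + B[i-D2] for its own i from
   local cells only. Indices whose neighbours fall outside 0..N-1 are
   left untouched. */
void compute(const Local loc[P], double *A)
{
    for (int p = 0; p < P; p++)
        for (int k = 0; k < LB; k++) {
            int i = p * LB + k;
            if (i - D2 < 0 || i + D1 >= N) continue;
            A[i] = loc[p].cell[D2 + k + D1] + loc[p].cell[k];
        }
}
```

After scatter and shadow_renew, compute gives the same values as the sequential loop over the global array, although no block ever reads another block's storage during the computation.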
30. Remote data of type SHADOW
    DVM(DISTRIBUTE [BLOCK][BLOCK]) float C[100][100];
    DVM(ALIGN [I][J] WITH C[I][J]) float A[100][100], B[100][100], D[100][100];
    DVM(SHADOW_GROUP) void *AB;
    . . .
    DVM(CREATE_SHADOW_GROUP AB: A B);
    . . .
    DVM(SHADOW_START AB);
    . . .
    DVM(PARALLEL [I][J] ON C[I][J]; SHADOW_WAIT AB)
    DO( I, 1, 98, 1)
    DO( J, 1, 98, 1)
    {
        C[I][J] = (A[I-1][J]+A[I+1][J]+A[I][J-1]+A[I][J+1])/4.;
        D[I][J] = (B[I-1][J]+B[I+1][J]+B[I][J-1]+B[I][J+1])/4.;
    }
31. Remote data of type ACROSS
    DVM(DISTRIBUTE [BLOCK]; SHADOW [d1:d2] ) float A[N];
    DVM(PARALLEL [i] ON A[i]; ACROSS A[d1:d2])
    FOR(i, N)
    {
        A[i] = A[i+d1] + A[i-d2];
    }
32. Remote data of type REMOTE
    DVM(DISTRIBUTE [BLOCK]) float A[N];
    DVM(PARALLEL [i] ON A[i]; REMOTE_ACCESS C[5] C[i+n])
    FOR(i, N)
    {
        A[i] = C[5] + C[i+n];
    }
33. Remote data of type REMOTE
    DVM (DISTRIBUTE [BLOCK][BLOCK]) float A1[M][N1+1], A2[M1+1][N2+1], A3[M2+1][N2+1];
    DVM (REMOTE_GROUP) void *RS;
    DO(ITER,1, MIT,1)
    {
        . . .
        DVM (PREFETCH RS);
        . . .
        DVM (PARALLEL [i] ON A1[i][N1]; REMOTE_ACCESS RS: A2[i][1])
        DO(i,0, M1-1,1)
            A1[i][N1] = A2[i][1];
        DVM (PARALLEL [i] ON A1[i][N1]; REMOTE_ACCESS RS: A3[i-M1][1])
        DO(i,M1, M-1,1)
            A1[i][N1] = A3[i-M1][1];
        DVM (PARALLEL [i] ON A2[i][0]; REMOTE_ACCESS RS: A1[i][N1-1])
        DO(i,0, M1-1,1)
            A2[i][0] = A1[i][N1-1];
        DVM (PARALLEL [i] ON A3[i][0]; REMOTE_ACCESS RS: A1[i+M1][N1-1])
        DO(i,0, M2-1,1)
            A3[i][0] = A1[i+M1][N1-1];
    }
34. Remote data of type REDUCTION
    DVM(DISTRIBUTE [BLOCK]) float A[N];
    DVM(PARALLEL [i] ON A[i]; REDUCTION SUM(s) )
    FOR(i, N)
    {
        A[i] = B[i] + C[i];
        s = s + A[i];
    }
    DVM( ALIGN [i] WITH A[i]) float B[N];
    DVM( ALIGN [i] WITH A[i]) float C[N];
- The reduction operations are:
- SUM, PRODUCT, AND, OR, MAX, MIN, MAXLOC, MINLOC
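A SUM reduction over a distributed loop can be pictured as private partial sums that are combined afterwards. The sketch below simulates this in one process; block_sum_reduction is a hypothetical name and the block bounds follow the formula used in the MPI listing, which is an assumption about the distribution rather than a description of DVM internals.

```c
#include <assert.h>

/* Sum reduction as a distributed loop would perform it: each of p
   simulated processors accumulates a private partial sum over its block,
   then the partial sums are combined into the final value. */
double block_sum_reduction(const double *a, int n, int p)
{
    double total = 0.0;
    assert(p > 0 && n >= 0);
    for (int proc = 0; proc < p; proc++) {
        int first = (proc * n) / p;         /* first index of the block */
        int end   = ((proc + 1) * n) / p;   /* one past the last index  */
        double partial = 0.0;               /* private copy of s        */
        for (int i = first; i < end; i++)
            partial += a[i];
        total += partial;                   /* combine step             */
    }
    return total;
}
```

For floating-point data the result may differ in the last bits from the sequential sum, because the additions are grouped differently; for the small integer values used in a quick check the two agree exactly.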
35. Remote data of type REDUCTION
    DVM(REDUCTION_GROUP) void *RG;
    S = 0; X = A[1]; Y = A[1]; MINI = 1;
    DVM(PARALLEL [I] ON A[I]; REDUCTION RG: SUM(S), MAX(X), MINLOC(Y,MINI))
    FOR(I, N)
    {
        S = S + A[I];
        X = max(X, A[I]);
        if(A[I] < Y) {
            Y = A[I];
            MINI = I;
        }
    }
    DVM(REDUCTION_START RG);
    DVM(PARALLEL [I] ON B[I])
    FOR( I, N)
        B[I] = C[I] + A[I];
    DVM(REDUCTION_WAIT RG);
36. Copying array sections
    DVM(DISTRIBUTE [BLOCK][]) float A[N][N];
    DVM(ALIGN [i][j] WITH A[j][i]) float B[N][N];
    . . .
    DVM(COPY)
    FOR(i,N)
    FOR(j,N)
        B[i][j]=A[i][j];
37. Copying array sections
    DVM(DISTRIBUTE [BLOCK][]) float A[N][N];
    DVM(ALIGN [i][j] WITH A[j][i]) float B[N][N];
    . . .
    DVM(COPY_FLAG) void *flag;
    . . .
    DVM(COPY_START &flag)
    FOR(i,N)
    FOR(j,N)
        B[i][j]=A[i][j];
    . . .
    DVM(COPY_WAIT &flag);
38. The hybrid MPI/OpenMP model
[Figure: data and computations are distributed between nodes by MPI; within node N, data and computations are distributed by OpenMP.]
39. Multicore and multithreaded processors
Intel Xeon 5000 series processors
    X5680      6 cores    3330 MHz
    X5677      4 cores    3460 MHz
Intel Xeon 7000 series processors
    X7560      8 cores    2226 MHz
    X7542      6 cores    2666 MHz
AMD Opteron 4100 series processors
    41KX HE    6 cores    2200 MHz
    41QS HE    4 cores    2500 MHz
40. Advantages of using OpenMP instead of MPI within nodes
- Incremental parallelization is possible.
- Programming is simpler and more efficient for irregular computations on shared data.
- The data duplication in memory that is typical of MPI programs is eliminated or reduced.
- An additional level of parallelism is easier to implement with OpenMP than with MPI.
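Incremental parallelization can be seen on a single Jacobi sweep: one directive is added and the rest of the sequential code stays as it was. The sketch below is an illustration only; NX and jacobi_sweep are hypothetical names, and the grid size is arbitrary.

```c
#include <assert.h>

#define NX 64   /* grid size (illustrative) */

/* One Jacobi sweep: b receives the average of the four neighbours in a.
   The only change relative to the sequential loop nest is the pragma.
   A compiler without OpenMP support ignores it, and the function then
   runs sequentially with the same result. */
void jacobi_sweep(double a[NX][NX], double b[NX][NX])
{
    int i, j;
    #pragma omp parallel for private(j)
    for (i = 1; i <= NX - 2; i++)
        for (j = 1; j <= NX - 2; j++)
            b[i][j] = (a[i-1][j] + a[i+1][j] + a[i][j-1] + a[i][j+1]) / 4.;
}
```

Both arrays are shared by all threads, so nothing is duplicated per thread and no explicit exchange of boundary rows is needed inside the node. A call such as jacobi_sweep(a, b) behaves the same whether or not the program is built with OpenMP enabled.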
41. Advantages of OpenMP for multicore processors
- The amount of main memory and cache memory available per core will, on average, shrink, so the memory savings inherent in OpenMP become very important.
- Cores share cache memory, which has to be taken into account when optimizing a program.
42. National Institute for Computational Sciences. University of Tennessee
- Supercomputer: Kraken Cray XT5-HE, Opteron Six Core 2.6 GHz
- 4th place in the TOP 500
- http://nics.tennessee.edu
- Peak performance: 1028.85 TFlop/s
- Number of processors/cores in the system: 16 288 / 98 928
- Linpack performance: 831.7 TFlop/s (81% of peak)
- Upgrade: 4-core AMD Opteron processors replaced with 6-core AMD Opteron processors
- Result: from 6th place in the TOP500 in June 2009 to 3rd place in the TOP500 in November 2009
43. National Institute for Computational Sciences. University of Tennessee
44. Joint Supercomputer Center of the Russian Academy of Sciences
- Supercomputer: MVS-100K
- 46th place in the TOP 500
- http://www.jscc.ru/
- Peak performance: 140.16 TFlop/s
- Number of processors/cores in the system: 2 920 / 11 680
- Linpack performance: 107.45 TFlop/s (76.7% of peak)
- Upgrade: 2-core Intel Xeon 53xx processors replaced with 4-core Intel Xeon 54xx processors
- Result: from 57th place in the TOP500 in June 2008 to 36th place in the TOP500 in November 2008
45. Oak Ridge National Laboratory
- Supercomputer: Jaguar Cray XT5-HE, Opteron Six Core 2.6 GHz
- 1st place in the TOP 500
- http://computing.ornl.gov
- Peak performance: 2331 TFlop/s
- Number of cores in the system: 224 162
- Linpack performance: 1759 TFlop/s (75.4% of peak)
- Upgrade: 4-core AMD Opteron processors replaced with 6-core AMD Opteron processors
- Result: from 2nd place in the TOP500 in June 2009 to 1st place in the TOP500 in November 2009
46. Oak Ridge National Laboratory
    MIN Cores    MAX Cores    MAXIMUM WALL-TIME (HOURS)
    135 000      -            24
    45 000       134 999      24
    4 500        44 999       12
    1 250        4 499        6
    1            1 249        2
47. Cray MPI: default parameters
    MPI Environment Variable Name | 1,000 PEs | 10,000 PEs | 50,000 PEs | 100,000 PEs
    MPI Environment Variable Name | 128,000 Bytes | 20,480 | 4096 | 2048
    MPICH_UNEX_BUFFER_SIZE (the buffer allocated to hold the unexpected Eager data) | 60 MB | 60 MB | 150 MB | 260 MB
    MPICH_PTL_UNEX_EVENTS (Portals generates two events for each unexpected message received) | 20,480 events | 22,000 | 110,000 | 220,000
    MPICH_PTL_UNEX_EVENTS (Portals generates two events for each unexpected message received) | 2048 events | 2500 | 12,500 | 25,000
57. Jacobi algorithm. MPI version
58. Jacobi algorithm. MPI version
    /* Jacobi-2d program */
    #include <math.h>
    #include <stdlib.h>
    #include <stdio.h>
    #include "mpi.h"
    #define m_printf if (myrank==0)printf
    #define L 1000
    #define LC 2
    #define ITMAX 100
    int i,j,it,k;
    double (* A)[L/LC+2];
    double (* B)[L/LC];
59. Jacobi algorithm. MPI version
    int main(int argc, char **argv)
    {
        MPI_Request req[8];
        int myrank, ranksize;
        int srow,lrow,nrow,scol,lcol,ncol;
        MPI_Status status[8];
        double t1;
        int isper[] = {0,0};
        int dim[2];
        int coords[2];
        MPI_Comm newcomm;
        MPI_Datatype vectype;
        int pleft,pright, pdown,pup;
        MPI_Init (&argc, &argv); /* initialize MPI system */
        MPI_Comm_size (MPI_COMM_WORLD, &ranksize); /* size of MPI system */
        MPI_Comm_rank (MPI_COMM_WORLD, &myrank); /* my place in MPI system */
60. Jacobi algorithm. MPI version
        dim[0]=ranksize/LC;
        dim[1]=LC;
        if ((L%dim[0])||(L%dim[1]))
        {
            m_printf("ERROR: array[%d*%d] is not distributed on %d*%d processors\n",L,L,dim[0],dim[1]);
            MPI_Finalize();
            exit(1);
        }
        MPI_Cart_create(MPI_COMM_WORLD,2,dim,isper,1,&newcomm);
        MPI_Cart_shift(newcomm,0,1,&pup,&pdown);
        MPI_Cart_shift(newcomm,1,1,&pleft,&pright);
        MPI_Comm_rank (newcomm, &myrank); /* my place in MPI system */
        MPI_Cart_coords(newcomm,myrank,2,coords);
61. Jacobi algorithm. MPI version
        /* rows of matrix I have to process */
        srow = (coords[0] * L) / dim[0];
        lrow = (((coords[0] + 1) * L) / dim[0])-1;
        nrow = lrow - srow + 1;
        /* columns of matrix I have to process */
        scol = (coords[1] * L) / dim[1];
        lcol = (((coords[1] + 1) * L) / dim[1])-1;
        ncol = lcol - scol + 1;
        MPI_Type_vector(nrow,1,ncol+2,MPI_DOUBLE,&vectype);
        MPI_Type_commit(&vectype);
        m_printf("JAC2 STARTED on %d*%d processors with %d*%d array, it=%d\n",dim[0],dim[1],L,L,ITMAX);
        /* dynamically allocate data structures */
        A = malloc ((nrow+2) * (ncol+2) * sizeof(double));
        B = malloc (nrow * ncol * sizeof(double));
62. Jacobi algorithm. MPI version
        for(i=0; i<=nrow-1; i++)
        {
            for(j=0; j<=ncol-1; j++)
            {
                A[i+1][j+1]=0.;
                B[i][j]=1.+srow+i+scol+j;
            }
        }
        /* iteration loop */
        MPI_Barrier(newcomm);
        t1=MPI_Wtime();
        for(it=1; it<=ITMAX; it++)
        {
            for(i=0; i<=nrow-1; i++)
            {
                if (((i==0)&&(pup==MPI_PROC_NULL))||((i==nrow-1)&&(pdown==MPI_PROC_NULL))) continue;
                for(j=0; j<=ncol-1; j++)
                {
                    if (((j==0)&&(pleft==MPI_PROC_NULL))||((j==ncol-1)&&(pright==MPI_PROC_NULL))) continue;
                    A[i+1][j+1] = B[i][j];
                }
            }
63. Jacobi algorithm. MPI version
            /* exchange boundary rows (contiguous) and columns (strided, via vectype) */
            MPI_Irecv(&A[0][1],ncol,MPI_DOUBLE, pup, 1235, MPI_COMM_WORLD, &req[0]);
            MPI_Isend(&A[nrow][1],ncol,MPI_DOUBLE, pdown, 1235, MPI_COMM_WORLD,&req[1]);
            MPI_Irecv(&A[nrow+1][1],ncol,MPI_DOUBLE, pdown, 1236, MPI_COMM_WORLD, &req[2]);
            MPI_Isend(&A[1][1],ncol,MPI_DOUBLE, pup, 1236, MPI_COMM_WORLD,&req[3]);
            MPI_Irecv(&A[1][0],1,vectype, pleft, 1237, MPI_COMM_WORLD, &req[4]);
            MPI_Isend(&A[1][ncol],1,vectype, pright, 1237, MPI_COMM_WORLD,&req[5]);
            MPI_Irecv(&A[1][ncol+1],1,vectype, pright, 1238, MPI_COMM_WORLD, &req[6]);
            MPI_Isend(&A[1][1],1,vectype, pleft, 1238, MPI_COMM_WORLD,&req[7]);
            MPI_Waitall(8,req,status);
64. Jacobi algorithm. MPI version
            for(i=1; i<=nrow; i++)
            {
                if (((i==1)&&(pup==MPI_PROC_NULL))||((i==nrow)&&(pdown==MPI_PROC_NULL))) continue;
                for(j=1; j<=ncol; j++)
                {
                    if (((j==1)&&(pleft==MPI_PROC_NULL))||((j==ncol)&&(pright==MPI_PROC_NULL))) continue;
                    B[i-1][j-1] = (A[i-1][j]+A[i+1][j]+A[i][j-1]+A[i][j+1])/4.;
                }
            }
        }
        printf("%d: Time of task=%lf\n",myrank,MPI_Wtime()-t1);
        MPI_Finalize ();
        return 0;
    }
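The column exchange in the 2D version relies on MPI_Type_vector(nrow, 1, ncol+2, ...) to describe one column of the local array, whose rows are ncol+2 doubles long. The memory layout this datatype selects can be shown without MPI. The function pack_vector below is a hypothetical helper written for this illustration; it covers only non-overlapping blocks with a positive stride.

```c
#include <assert.h>

/* Copy `count` blocks of `blocklen` doubles, whose starts lie `stride`
   doubles apart, into a contiguous buffer. This is the set of elements
   that MPI_Type_vector(count, blocklen, stride, MPI_DOUBLE, ...) picks
   out of memory, restricted here to stride >= blocklen. */
void pack_vector(const double *src, int count, int blocklen, int stride,
                 double *dst)
{
    assert(count >= 0 && blocklen >= 0 && stride >= blocklen);
    for (int b = 0; b < count; b++)
        for (int k = 0; k < blocklen; k++)
            dst[b * blocklen + k] = src[b * stride + k];
}
```

For a local array with nrow = 3 and ncol = 4 stored row by row with one halo cell on each side (row length 6), calling pack_vector with count 3, block length 1 and stride 6 on the address of element [1][1] gathers the first interior column.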
65. Jacobi algorithm. MPI/OpenMP version
        /* iteration loop */
        t1=MPI_Wtime();
        #pragma omp parallel default(none) private(it,i,j) shared (A,B,myrank,nrow,ranksize,ll,shift,req,status)
        for(it=1; it<=ITMAX; it++)
        {
            for(i=1; i<=nrow; i++)
            {
                if (((i==1)&&(myrank==0))||((i==nrow)&&(myrank==ranksize-1))) continue;
                #pragma omp for nowait
                for(j=1; j<=L-2; j++)
                {
                    A[i][j] = B[i-1][j];
                }
            }
66. Jacobi algorithm. MPI/OpenMP version
            #pragma omp barrier
            #pragma omp single
            {
                if(myrank!=0)
                    MPI_Irecv(&A[0][0],L,MPI_DOUBLE, myrank-1, 1235, MPI_COMM_WORLD, &req[0]);
                if(myrank!=ranksize-1)
                    MPI_Isend(&A[nrow][0],L,MPI_DOUBLE, myrank+1, 1235, MPI_COMM_WORLD,&req[2]);
                if(myrank!=ranksize-1)
                    MPI_Irecv(&A[nrow+1][0],L,MPI_DOUBLE, myrank+1, 1236, MPI_COMM_WORLD, &req[3]);
                if(myrank!=0)
                    MPI_Isend(&A[1][0],L,MPI_DOUBLE, myrank-1, 1236, MPI_COMM_WORLD,&req[1]);
                ll=4; shift=0;
                if (myrank==0) {ll=2;shift=2;}
                if (myrank==ranksize-1) {ll=2;}
                MPI_Waitall(ll,&req[shift],&status[0]);
            }
67. Jacobi algorithm. MPI/OpenMP version
            for(i=1; i<=nrow; i++)
            {
                if (((i==1)&&(myrank==0))||((i==nrow)&&(myrank==ranksize-1))) continue;
                #pragma omp for nowait
                for(j=1; j<=L-2; j++)
                    B[i-1][j] = (A[i-1][j]+A[i+1][j]+A[i][j-1]+A[i][j+1])/4.;
            }
        } /* DO it */
        printf("%d: Time of task=%lf\n",myrank,MPI_Wtime()-t1);
        MPI_Finalize ();
        return 0;
    }
68. The hybrid DVM/OpenMP model
[Figure: data are distributed between nodes by DVM; within node N, data are handled by DVM and computations by OpenMP.]
69. Jacobi algorithm. DVM/OpenMP version
          PROGRAM JAC_OpenMP_DVM
          PARAMETER (L=1000, ITMAX=100)
          REAL A(L,L), B(L,L)
    CDVM$ DISTRIBUTE ( BLOCK, BLOCK) :: A
    CDVM$ ALIGN B(I,J) WITH A(I,J)
          PRINT *, ' TEST_JACOBI '
    C$OMP PARALLEL DEFAULT(NONE ) SHARED(A,B) PRIVATE(IT,I,J)
          DO IT = 1, ITMAX
    CDVM$ PARALLEL (J,I) ON A(I, J)
            DO J = 2, L-1
    C$OMP DO
              DO I = 2, L-1
                A(I, J) = B(I, J)
              ENDDO
    C$OMP ENDDO NOWAIT
            ENDDO
70. Jacobi algorithm. DVM/OpenMP version
    C$OMP BARRIER
    CDVM$ PARALLEL (J,I) ON B(I, J), SHADOW_RENEW (A)
            DO J = 2, L-1
    C$OMP DO
              DO I = 2, L-1
                B(I, J) = (A(I-1, J) + A(I, J-1) + A(I+1, J) + A(I, J+1)) / 4
              ENDDO
    C$OMP ENDDO NOWAIT
            ENDDO
          ENDDO
    C$OMP END PARALLEL
          END
71. NASA MultiZone tests
- BT (Block Tridiagonal Solver): 3D Navier-Stokes, alternating direction method
- LU (Lower-Upper Solver): 3D Navier-Stokes, successive over-relaxation method
- SP (Scalar Pentadiagonal Solver): 3D Navier-Stokes, Beam-Warming approximate factorization
http://www.nas.nasa.gov/News/Techreports/2003/PDF/nas-03-010.pdf
72. NASA MultiZone tests
73. SP-MZ test (class A) on IBM eServer pSeries 690 Regatta
[Chart: DVM compared with MPI]
74. LU-MZ test (class A) on IBM eServer pSeries 690 Regatta
[Chart: DVM compared with MPI]
75. BT-MZ test (class A) on IBM eServer pSeries 690 Regatta; zones from 13 x 13 x 16 to 58 x 58 x 16
[Chart: DVM compared with MPI]
76. Advantages of the hybrid MPI/OpenMP model
- Data duplication in node memory is eliminated or reduced.
- An additional level of parallelism is easier to implement with OpenMP than with MPI (for example, when a program has two levels of parallelism: parallelism between subtasks and parallelism within a subtask).
- Load balancing on multi-block problems improves, and implementing one more level of parallelism takes less effort.
77. Advantages of the hybrid DVM/OpenMP model
- The OpenMP and DVM models are close to each other, which makes them easier to use together.
- The resulting programs are flexible and can adapt to the heterogeneity of the nodes of an SMP cluster.
- The parallel program can be used as a sequential program, as an OpenMP program, as a DVM program, and as a DVM/OpenMP program.
78. Hybrid computing system
A hybrid computing system is a parallel computing system in which, alongside general-purpose processors, various types of accelerators or coprocessors with non-standard architectures are used (vector, multithreaded, reconfigurable for a specific task, and so on).
? ?????? ?????? ?????????? ????? ?????????????,
????????? ???????? ?????????? ????????????.
79?????????????? ???????? ???-???????
80. The PGI Accelerator model for Fortran and C
    !$acc data region copy(a(1:n,1:m)) local(b(2:n-1,2:m-1)) copyin(w(2:n-1))
    do while(resid .gt. tol)
      resid = 0.0
    !$acc region
      do i = 2, n-1
        do j = 2, m-1
          b(i,j) = 0.25*w(i)*(a(i-1,j)+a(i,j-1)+a(i+1,j)+a(i,j+1)) + (1.0-w(i))*a(i,j)
        enddo
      enddo
81. The PGI Accelerator model for Fortran and C
      do i = 2, n-1
        do j = 2, m-1
          resid = resid + (b(i,j)-a(i,j))**2
          a(i,j) = b(i,j)
        enddo
      enddo
    !$acc end region
    enddo
    !$acc end data region
http://www.pgroup.com/lit/whitepapers/pgi_accel_prog_model_1.2.pdf
83. Questions?