????: - PowerPoint PPT Presentation

About This Presentation
Title:

????:

Description:

1.2 30.12.2005 ... Title: Author – PowerPoint PPT presentation

Number of Views:18
Avg rating:3.0/5.0
Slides: 84
Provided by: 877454
Category:
Tags: openmp

less

Transcript and Presenter's Notes

Title: ????:


1
???? DVM-?????????? ?????????? ????????????
???????? ??? ?????????????? ?????????
?????? ???????? ????????????? ?.?.-?.?., ???.
???????? ????????? ?????????? ?????????? ??
?.?.??????? ??? ????????? ??????? ??????????
???????????????? ?????????? ??? ???????????
???????????????? ???????????? ??. ?.?.
?????????? bakhtin_at_keldysh.ru
2
??????????
  • ???????
  • MPI ?????? ???????? ?????????
  • DVM ?????? ???????????? ?? ?????? ? ??????????.
    ???? ??-DVM
  • ?????????
  • ???????????? ? ????????????? ??????????.
    SMP-????????
  • ????????? ?????? ???????????????? MPI/OpenMP
  • ???? Fortran-DVM/OpenMP
  • ???????
  • ????????? ?????????????? ????????
  • PGI Accelerator Model
  • ???? Fortran-DVM/OpenMP/Accelerator

3
???????? ?????. ???????????????? ??????
  • / Jacobi program /
  • include ltstdio.hgt
  • define L 1000
  • define ITMAX 100
  • int i,j,it
  • double ALL
  • double BLL
  • int main(int an, char as)
  • printf("JAC STARTED\n")
  • for(i0iltL-1i)
  • for(j0jltL-1j)
  • Aij0.
  • Bij1.ij

4
???????? ?????. ???????????????? ??????
  • / iteration loop
    /
  • for(it1 itltITMAXit)
  • for(i1iltL-2i)
  • for(j1jltL-2j)
  • Aij
    Bij
  • for(i1iltL-2i)
  • for(j1jltL-2j)
  • Bij
    (Ai-1jAi1jAij-1Aij1)/4.
  • return 0

5
???????? ?????. MPI-??????
6
???????? ?????. MPI-??????
  • / Jacobi-1d program /
  • include ltmath.hgt
  • include ltstdlib.hgt
  • include ltstdio.hgt
  • include "mpi.h"
  • define m_printf if (myrank0)printf
  • define L 1000
  • define ITMAX 100
  • int i,j,it,k
  • int ll,shift
  • double ( A)L
  • double ( B)L

7
???????? ?????. MPI-??????
  • int main(int argc, char argv)
  • MPI_Request req4
  • int myrank, ranksize
  • int startrow,lastrow,nrow
  • MPI_Status status4
  • double t1, t2, time
  • MPI_Init (argc, argv) / initialize MPI system
    /
  • MPI_Comm_rank(MPI_COMM_WORLD, myrank)/my place
    in MPI system/
  • MPI_Comm_size (MPI_COMM_WORLD, ranksize) /
    size of MPI system /
  • MPI_Barrier(MPI_COMM_WORLD)
  • / rows of matrix I have to process /
  • startrow (myrank L) / ranksize
  • lastrow (((myrank 1) L) / ranksize)-1
  • nrow lastrow - startrow 1
  • m_printf("JAC1 STARTED\n")

8
???????? ?????. MPI-??????
  • / dynamically allocate data structures /
  • A malloc ((nrow2) L sizeof(double))
  • B malloc ((nrow) L sizeof(double))
  • for(i1 iltnrow i)
  • for(j0 jltL-1 j)
  • Aij0.
  • Bi-1j1.startrowi-1j

9
???????? ?????. MPI-??????
  • / iteration loop
    /
  • t1MPI_Wtime()
  • for(it1 itltITMAX it)
  • for(i1 iltnrow i)
  • if (((i1)(myrank0))((inrow)(myrankra
    nksize-1))) continue
  • for(j1 jltL-2 j)
  • Aij Bi-1j

10
???????? ?????. MPI-??????
  • if(myrank!0)
  • MPI_Irecv(A00,L,MPI_DOUBLE, myrank-1,
    1235,

  • MPI_COMM_WORLD, req0)
  • if(myrank!ranksize-1)
  • MPI_Isend(Anrow0,L,MPI_DOUBLE,
    myrank1, 1235,

  • MPI_COMM_WORLD,req2)
  • if(myrank!ranksize-1)
  • MPI_Irecv(Anrow10,L,MPI_DOUBLE,
    myrank1, 1236,

  • MPI_COMM_WORLD, req3)
  • if(myrank!0)
  • MPI_Isend(A10,L,MPI_DOUBLE,
    myrank-1, 1236,

  • MPI_COMM_WORLD,req1)
  • ll4 shift0
  • if (myrank0) ll2shift2
  • if (myrankranksize-1) ll2
  • MPI_Waitall(ll,reqshift,status0)

11
???????? ?????. MPI-??????
for(i1 iltnrow i) if
(((i1)(myrank0))((inrow)(myrankranks
ize-1))) continue for(j1 jltL-2
j) Bi-1j (Ai-1jAi1j
Aij-1Aij1)/4. /DO
it/ printf("d Time of tasklf\n",myrank,MPI_W
time()-t1) MPI_Finalize () return 0
12
???????? ?????. DVM-??????
  • include ltstdio.hgt
  • define L 1000
  • define ITMAX 100
  • int i,j,it
  • define DVM(dvmdir)
  • define DO(v,l,h,s) for(v(l) vlt(h) v(s))
  • DVM(DISTRIBUTE BLOCKBLOCK) double ALL
  • DVM(ALIGN ij WITH Aij) double BLL
  • int main(int an, char as)
  • printf("JAC STARTED\n")
  • DVM(PARALLEL ij ON Aij)
  • DO(i,0,L-1,1)
  • DO(j,0,L-1,1)
  • Aij0.
  • Bij1.ij

13
???????? ?????. DVM-??????
  • / iteration loop
    /
  • for(it1 itltITMAXit)
  • DVM(PARALLEL ij ON Aij)
  • DO(i,1,L-2,1)
  • DO(j,1,L-2,1)
  • Aij Bij
  • DVM(PARALLEL ij ON Bij
    SHADOW_RENEW A)
  • DO(i,1,L-2,1)
  • DO(j,1,L-2,1)
  • Bij
    (Ai-1jAi1jAij-1Aij1)/4.
  • return 0

14
?????? ???????????? ?? ?????? ? ??????????. DVM
  • ??? ?????? (1993 ?.), ?????????? ? ?????? ??????
    ????????????? ???????????????? ???????-DVM ?
    ??-DVM, ?????????? ??????????? ??????
    ???????????? ?? ?????? ? ?????? ???????????? ??
    ??????????
  • ???????????? ?? ???? ?????? ??????? ??????????
    ???????????? ???????? (DVM) ??????? ? ??? ??.
    ?.?. ??????? ???
  • ???????????? DVM (Distributed Virtual Memory,
    Distributed Virtual Machine) ???????? ?????????
    ??????????? ????? ?????? ?? ??????????????
    ????????

15
?????? ???????????????? DVM
  • ?? ???????????? ??????????? ??????????????? ??
    ?????????? ??????? ??????????? ??????????
  • ??????????? ?????????? ????? (?????????) ??????,
    ?.?. ??????, ??????????? ?? ????? ??????????? ?
    ???????????? ?? ?????? ???????????
  • ??????????? ???????? ????? ? ????????????????
    ?????????, ??? ?????????? ?????????? ????????
    ????? ??????
  • ??????????? ???????????? ?? ???????????
    ??????????? ???????????? ?????? ?? ?????? ??????,
    ?? ? ??????????????? ??????????

16
?????? DVM-???????
  • DVM-??????? ??????? ?? ????????? ?????????
  • ?????????? Fortran-DVM/OpenMP
  • ?????????? C-DVM
  • ?????????? ????????? LIB-DVM
  • DVM-????????
  • ????????????? ?????????? DVM-????????
  • ?????????? ?????????????????? DVM-????????

17
???????? ????????????????
  • C-DVM ???? ?? ??????????? ???????
  • Fortran-DVM/OpenMP ???? ??????? 95
    ??????????? ???????????
  • ??????????? ??????????? ? ??????? ????????
    ???????????????? ?????????????? ???????????? ?
    ???????? ???????????????? ?????????
  • ??????????? ?????????????? ???????? ?????? ?
    ?????????????
  • ???????????????? ????? ????????????????
  • ???????????? ???????????? ???????? ???
    ??????????? ????????????
  • ?????????? ?????? ???? ????????? ????????? ???
    ????????????????? ? ????????????? ?????

18
??????????? ???????????????? ?????????

PARALLEL
?????
???????
PARALLEL
ALIGN
?????? ?????
DISTRIBUTE
???????
?????? ??????????? ???????????
MAP
DISTRIBUTE
?????????? ??????????
19
????????????? ??????. DISTRIBUTE
  • DVM( DISTRIBUTE f1fk ) lt????????-???????-??-????
    ?-??gt
  • ??? fi BLOCK - ????????????? ???????
    ??????? (?????????????? ?????????)
  • MULT_BLOCK(m) - ????????????? ???????
    ???????? m (?????????????? ?????????)
  • GENBLOCK ( block-array-name ) - ?????????????
    ??????? ????????? ????????
  • WGTBLOCK ( block-array-name,nblock ) -
    ????????????? ??????????? ???????
  • - ????????????? ????? ??????????
    (????????? ?????????)
  • k - ?????????? ????????? ???????

20
????????????? ??????. DISTRIBUTE
  • DVM(DISTRIBUTE BLOCK) double A12
  • DVM(DISTRIBUTE BLOCK) double B6
  • DVM(DISTRIBUTE MULT_BLOCK(3)) double
    A12
  • node1 node2 node3 node4
  • A 0,1,2 3,4,5 6,7,8 9,10,11
  • B 0,1 2,3 4 5
  • double wb61.,0.5,0.5,0.5,0.5,1.
  • int bs42,4,4,2
  • DVM(DISTRIBUTE GEN_BLOCK(bs)) double
    A12
  • DVM(DISTRIBUTE WGT_BLOCK(wb,6)) double
    B6
  • node1 node2 node3 node4
  • A 0,1 2,3,4,5 6,7,8,9 10,11
  • B 0 1,2 3,4 5

21
????????????? ??????. DISTRIBUTE
  • DVM(DISTRIBUTE BLOCK   BLOCK) float
    ANNN
  • dvm run M1 M2 lt???-?????????gt /P(M1,M2)/
  • ????????? ???????????? ?????? ????????? ??????? ?
    ?? ?????? ????????? P ??????? ??????? N/M1,
    ?????? ????????? ? - ?? ?????? ????????? P
    ??????? ??????? N/M2, ? ?????? ????????? ? ?????
    ??????? ???????????? ?? ?????? ???????????
    ?????????.
  • define DVM(dvmdir)

22
????????????? ?????????? ??? ???????????? ??????
  • ???? ????-????????? ??????????? ?? ????
  • ??????????? ? ????????? ?????????????
  • ??????
  • ??????? ??????????? ??????????
  • OWN(Ai) - ?????????, ?? ??????? ???????????
    Ai
  • Ai expri
  • ???????? ?????? ??????????? ?? ??????????
    OWN(Ai)
  • ????? ???? - ????? ???????? ?? ???????
    ??????????? ??????????

23
??????????? ??????. ALIGN
  • DVM(DISTRIBUTE BBLOCKBLOCK) float BNM1
  • . . .
  • for (i0 iltN-1i)
  • for(j0 jltM-2 j)
  • Bij1 Aij
  • DVM(ALIGN ij WITH Bij1) float ANM

24
??????????? ??????. ALIGN
  • DVM(ALIGN a1 an WITH B b1 bm )
    lt????????-???????-A-??-?????-??gt
  • ??? ai - ???????? i?? ????????? ??????????????
    ??????? ?
  • bj - ???????? j?? ????????? ???????? ??????? B
  • n - ?????????? ????????? ??????? ?
  • m - ?????????? ????????? ??????? ?
  • ai IDi bj cIDjd
  • ??? IDi , IDj - ??????????????
  • c, d - ????????????? ?????????

25
??????????? ??????. ALIGN
  • DVM(ALIGN i WITH B2i1 ) float AN
  • ???????????? ??????? A i ? B2i1 ?? ????
    ?????????.
  • DVM(ALIGN ij WITH Bji) float ANN
  • ???????????? ??????? A i j ? B j i
    ?? ???? ?????????.
  • DVM(ALIGN i WITH Bi) float AN
  • ???????????? ??????? A i ?? ?? ??????????,
    ??? ???????? ???? ?? ???? ??????? i-??? ???????
    B.
  • DVM(ALIGN i WITH Bi) float ANN
  • ???????????? i-?? ?????? A ? ??????? B i ??
    ???? ?????????, ?.?. ?????????? ?????? ?????????
    ??????? ?.

26
????????????? ?????? ?????. PARALLEL
  • ????????? ????????????? ????? ?? ??????
    ??????????? ??????? ??????????? (?????-?????????
    ????)
  • ????????? ?????????? ????????????? ????? ??
    ?????? ?????????? ? ???????? ?????????? ?????
    (????????????? ????????? ????????????)
  • ????? ????? ?????? ???? ????????? ???????? ?
    ??????????? ?? ????? ??????????. ??????? ?????
    ????? ?????????? ???????????? ?????? ????? ?????
    ?????? ???? ???????????? ?? ???? ?????????
    (???????????? ? ???????? ??????????? ??????????).

27
????????????? ?????? ?????. PARALLEL
  • FOR(i, N)
  • D2i
  • D2i1
  • DVM(PARALLEL i ON D2i)
  • FOR(i, N)
  • D2i
  • DVM(PARALLEL i ON D2i1)
  • FOR(i, N)
  • D2i1

28
??????????? ??????. TEMPLATE
  • DVM(PARALLEL i ON Ai)
  • FOR(i, N)
  • Ai Bid1 Ci-d2
  • DVM(DISTRIBUTE BLOCK TEMPLATE Nd1d2) void
    TABC
  • DVM( ALIGN i WITH TABCi ) float
    BN
  • DVM( ALIGN i WITH TABCid2 ) float
    AN
  • DVM( ALIGN i WITH TABCid1d2 ) float
    CN

29
????????? ?????? ???? SHADOW
  • DVM(DISTRIBUTE BLOCK) float AN
  • DVM(PARALLEL i ON Ai SHADOW_RENEW B )
  • FOR(i, N)
  • Ai Bid1 Bi-d2
  • DVM( ALIGN i WITH Ai SHADOW d1d2 )
    float BN

30
????????? ?????? ???? SHADOW
  • DVM(DISTRIBUTE BLOCKBLOCK) float C100100
  • DVM(ALIGNIJ WITH CIJ) float A100100,
    B100100, D100100
  • DVM(SHADOW_GROUP) void AB
  • . . .
  • DVM(CREATE_SHADOW_GROUP AB A B)
  • . . .
  • DVM(SHADOW_START AB)
  • . . .
  • DVM(PARALLELIJ ON CIJ SHADOW_WAIT AB)
  • DO( I, 1, 98, 1)
  • DO( J, 1, 98, 1)
  • CIJ (AI-1JAI1JAIJ-1A
    IJ1)/4.
  • DIJ (BI-1JBI1JBIJ-1B
    IJ1)/4.

31
????????? ?????? ???? ACROSS
  • DVM(DISTRIBUTE BLOCK SHADOW d1d2 ) float
    AN
  • DVM(PARALLEL i ON Ai ACROSS Ad1d2)
  • FOR(i, N)
  • Ai Aid1 Ai-d2

32
????????? ?????? ???? REMOTE
  • DVM(DISTRIBUTE BLOCK) float AN
  • DVM(PARALLEL i ON Ai REMOTE_ACCESS C5
    Cin)
  • FOR(i, N)
  • Ai C5 Cin

33
????????? ?????? ???? REMOTE
  • DVM (DISTRIBUTE BLOCKBLOCK) float
    A1MN11, A2M11N21, A3M21N21
  • DVM (REMOTE_GROUP) void RS
  • DO(ITER,1, MIT,1)
  • . . .
  • DVM (PREFETCH RS)
  • . . .
  • DVM ( PARALLELi ON A1iN1 REMOTE_ACCESS RS
    A2i1)
  • DO(i,0, M1-1,1)
  • A1iN1 A2i1
  • DVM (PARALLELi ON A1iN1 REMOTE_ACCESS RS
    A3i-M11)
  • DO(i,M1, M-1,1)
  • A1iN1 A3i-M11
  • DVM (PARALLELi ON A2i0 REMOTE_ACCESS RS
    A1IN1-1)
  • DO(i,0, M1-1,1)
  • A2i0 A1iN1-1
  • DVM(PARALLELi ON A3i0 REMOTE_ACCESS RS
    A1IM1N1-1)
  • DO (i,0, M2-1,1)
  • A3i0 A1iM1N1-1

34
????????? ?????? ???? REDUCTION
  • DVM(DISTRIBUTE BLOCK) float AN
  • DVM(PARALLEL i ON Ai REDUCTION SUM(S) )
  • FOR(i, N)
  • Ai Bi Ci
  • s s Ai
  • DVM( ALIGN i WITH Ai) float BN
  • DVM( ALIGN i WITH Ai) float CN
  • ? ???????????? ?????????? ?????????
  • SUM, PRODUCT, AND, OR, MAX, MIN, MAXLOC, MINLOC

35
????????? ?????? ???? REDUCTION
  • DVM(REDUCTION_GROUP) void RG
  • S 0 X A1 Y A1 MINI 1
  • DVM(PARALLELI ON AI REDUCTION RG SUM(S),
    MAX(X), MINLOC(Y,MIMI))
  • FOR(I, N)
  • S S AI
  • X max(X, AI)
  • if(AI lt Y)
  • Y AI
  • MINI I
  • DVM(REDUCTION_START RG)
  • DVM(PARALLELI ON BI)
  • FOR( I, N)
  • BI CI AI
  • DVM(REDUCTION_WAIT RG)

36
??????????? ?????? ????????
  • DVM(DISTRIBUTE BLOCK) float ANN
  • DVM(ALIGN ij WITH ji) float BNN
  • . . .
  • DVM(COPY)
  • FOR(i,N)
  • FOR(j,N)
  • BijAij

37
??????????? ?????? ????????
  • DVM(DISTRIBUTE BLOCK) float ANN
  • DVM(ALIGN ij WITH ji) float BNN
  • . . .
  • DVM(COPY_FLAG) void flag
  • . . .
  • DVM(COPY_START flag)
  • FOR(i,N)
  • FOR(j,N)
  • BijAij
  • . . .
  • DVM(COPY_WAIT flag)

38
????????? ?????? MPI/OpenMP
??????
??????????
MPI
??????
??????????
OpenMP

???? N
39
???????????? ? ????????????? ??????????
?????????? Intel Xeon ????? 5000
X5680 6 cores X5677 4 cores
3330 MHz 3460 MHz
?????????? Intel Xeon ????? 7000
X7560 8 cores X7542 6 cores
2226 MHz 2666 MHz
?????????? AMD Opteron ????? 4100
41KX HE 6 cores 41QS HE 4 cores
2200 MHz 2500 MHz
40
??????????? ????????????? ? ????? OpenMP ??????
MPI
  • ??????????? ???????????????? ?????????????????.
  • ????????? ???????????????? ? ????????????? ??
    ???????????? ???????????, ?????????? ???
    ?????? ???????.
  • ?????????? ??? ?????????? ???????????? ?????? ?
    ??????, ????????????? MPI-??????????.
  • ?????????????? ??????? ???????????? ?? OpenMP
    ??????????? ?????, ??? ?? MPI.

41
???????????? OpenMP ??? ???????????? ???????????
  • ?????? ??????????? ?????? ? ??? ??????,
    ???????????? ? ??????? ?? ???? ????, ?????
    ??????????? ???????? OpenMP ???????? ??????
    ?????????? ????? ?????.
  • ???? ?????????? ????? ???-??????, ??? ?????????
    ????????? ??? ??????????? ?????????.

42
National Institute for Computational Sciences.
University of Tennessee
  • ?????????????? Kraken Cray XT5-HE Opteron Six
    Core 2.6 GHz
  • 4 ????? ? TOP 500
  • http//nics.tennessee.edu
  • ??????? ?????????????????? - 1028.85 TFlop/s
  • ????? ???????????/???? ? ??????? 16 288 / 98
    928
  • ?????????????????? ?? Linpack - 831.7 TFlop/s
    (81 ?? ???????)
  • Updrage ?????? 4-? ??????? ??????????? AMD
    Opteron ?? 6-?? ??????? ?????????? AMD Opteron
  • ????????? 6-?? ????? ? TOP500 ? ???? 2009 -
    3-?? ????? ? TOP500 ? ?????? 2009

43
National Institute for Computational Sciences.
University of Tennessee
44
???????????????? ????????????????? ?????
?????????? ???????? ????
  • ?????????????? MVS-100K
  • 46 ????? ? TOP 500
  • http//www.jscc.ru/
  • ??????? ?????????????????? - 140.16 TFlop/s
  • ????? ???????????/???? ? ??????? 2 920/11 680
  • ?????????????????? ?? Linpack - 107.45 TFlop/s
    (76.7 ?? ???????)
  • Updrage ?????? 2-? ??????? ??????????? Intel
    Xeon 53xx ?? 4-? ??????? ?????????? Intel Xeon
    54xx
  • ????????? 57-?? ????? ? TOP500 ? ???? 2008 -
    36-?? ????? ? TOP500 ? ?????? 2008

45
Oak Ridge National Laboratory
  • ?????????????? Jaguar Cray XT5-HE Opteron Six
    Core 2.6 GHz
  • 1 ????? ? TOP 500
  • http//computing.ornl.gov
  • ??????? ?????????????????? - 2331 TFlop/s
  • ????? ???? ? ??????? 224 162
  • ?????????????????? ?? Linpack - 1759 TFlop/s
    (75.4 ?? ???????)
  • Updrage ?????? 4-? ??????? ??????????? AMD
    Opteron ?? 6-?? ??????? ?????????? AMD Opteron
  • ????????? 2-?? ????? ? TOP500 ? ???? 2009 -
    1-?? ????? ? TOP500 ? ?????? 2009

46
Oak Ridge National Laboratory
  • Jaguar Scheduling Policy

MIN Cores MAX Cores MAXIMUM WALL-TIME (HOURS)
135 000 24
45 000 134 999 24
4 500 44 999 12
1 250 4 499 6
1 1 249 2
47
Cray MPI ????????? ?? ?????????
MPI Environment Variable Name 1,000 PEs 10,000 PEs 50,000 PEs 100,000 Pes
MPI Environment Variable Name 128,000 Bytes 20,480 4096 2048
MPICH_UNEX_BUFFER_SIZE (The buffer allocated to hold the unexpected Eager data) 60 MB 60 MB 150 MB 260 MB
MPICH_PTL_UNEX_EVENTS (Portals generates two events for each unexpected message received) 20,480 events 22,000 110,000 220,000
MPICH_PTL_UNEX_EVENTS (Portals generates two events for each unexpected message received) 2048 events 2500 12,500 25,000
48
???????? ?????. ???????????????? ??????
  • / Jacobi program /
  • include ltstdio.hgt
  • define L 1000
  • define ITMAX 100
  • int i,j,it
  • double ALL
  • double BLL
  • int main(int an, char as)
  • printf("JAC STARTED\n")
  • for(i0iltL-1i)
  • for(j0jltL-1j)
  • Aij0.
  • Bij1.ij

49
???????? ?????. ???????????????? ??????
  • / iteration loop
    /
  • for(it1 itltITMAXit)
  • for(i1iltL-2i)
  • for(j1jltL-2j)
  • Aij
    Bij
  • for(i1iltL-2i)
  • for(j1jltL-2j)
  • Bij
    (Ai-1jAi1jAij-1Aij1)/4.
  • return 0

50
???????? ?????. MPI-??????
51
???????? ?????. MPI-??????
  • / Jacobi-1d program /
  • include ltmath.hgt
  • include ltstdlib.hgt
  • include ltstdio.hgt
  • include "mpi.h"
  • define m_printf if (myrank0)printf
  • define L 1000
  • define ITMAX 100
  • int i,j,it,k
  • int ll,shift
  • double ( A)L
  • double ( B)L

52
???????? ?????. MPI-??????
  • int main(int argc, char argv)
  • MPI_Request req4
  • int myrank, ranksize
  • int startrow,lastrow,nrow
  • MPI_Status status4
  • double t1, t2, time
  • MPI_Init (argc, argv) / initialize MPI system
    /
  • MPI_Comm_rank(MPI_COMM_WORLD, myrank)/my place
    in MPI system/
  • MPI_Comm_size (MPI_COMM_WORLD, ranksize) /
    size of MPI system /
  • MPI_Barrier(MPI_COMM_WORLD)
  • / rows of matrix I have to process /
  • startrow (myrank L) / ranksize
  • lastrow (((myrank 1) L) / ranksize)-1
  • nrow lastrow - startrow 1
  • m_printf("JAC1 STARTED\n")

53
???????? ?????. MPI-??????
  • / dynamically allocate data structures /
  • A malloc ((nrow2) L sizeof(double))
  • B malloc ((nrow) L sizeof(double))
  • for(i1 iltnrow i)
  • for(j0 jltL-1 j)
  • Aij0.
  • Bi-1j1.startrowi-1j

54
???????? ?????. MPI-??????
  • / iteration loop
    /
  • t1MPI_Wtime()
  • for(it1 itltITMAX it)
  • for(i1 iltnrow i)
  • if (((i1)(myrank0))((inrow)(myrankra
    nksize-1))) continue
  • for(j1 jltL-2 j)
  • Aij Bi-1j

55
???????? ?????. MPI-??????
  • if(myrank!0)
  • MPI_Irecv(A00,L,MPI_DOUBLE,
    myrank-1, 1235,
  • MPI_COMM_WORLD, req0)
  • if(myrank!ranksize-1)
  • MPI_Isend(Anrow0,L,MPI_DOUBLE,
    myrank1, 1235,
  • MPI_COMM_WORLD,req2)
  • if(myrank!ranksize-1)
  • MPI_Irecv(Anrow10,L,MPI_DOUBLE, myrank1,
    1236, MPI_COMM_WORLD, req3)
  • if(myrank!0)
  • MPI_Isend(A10,L,MPI_DOUBLE, myrank-1, 1236,
    MPI_COMM_WORLD,req1)
  • ll4 shift0
  • if (myrank0) ll2shift2
  • if (myrankranksize-1) ll2
  • MPI_Waitall(ll,reqshift,status0)

56
???????? ?????. MPI-??????
for(i1 iltnrow i) if
(((i1)(myrank0))((inrow)(myrankranks
ize-1))) continue for(j1 jltL-2
j) Bi-1j (Ai-1jAi1j
Aij-1Aij1)/4. /DO
it/ printf("d Time of tasklf\n",myrank,MPI_W
time()-t1) MPI_Finalize () return 0
57
???????? ?????. MPI-??????
58
???????? ?????. MPI-??????
/Jacobi-2d program / include ltmath.hgt include
ltstdlib.hgt include ltstdio.hgt include
"mpi.h" define m_printf if (myrank0)printf def
ine L 1000 define LC 2 define ITMAX 100 int
i,j,it,k double ( A)L/LC2 double (
B)L/LC
59
???????? ?????. MPI-??????
int main(int argc, char argv) MPI_Request
req8 int myrank, ranksize int
srow,lrow,nrow,scol,lcol,ncol MPI_Status
status8 double t1 int isper 0,0 int
dim2 int coords2 MPI_Comm
newcomm MPI_Datatype vectype int
pleft,pright, pdown,pup MPI_Init (argc,
argv) / initialize MPI system
/ MPI_Comm_size (MPI_COMM_WORLD, ranksize)
/ size of MPI system / MPI_Comm_rank
(MPI_COMM_WORLD, myrank) / my place in MPI
system /
60
???????? ?????. MPI-??????
dim0ranksize/LC dim1LC if
((Ldim0)(Ldim1)) m_printf("ERROR
arraydd is not distributed on dd
processors\n",L,L,dim0,dim1) MPI_Finalize()
exit(1) MPI_Cart_create(MPI_COMM_WORLD,2,
dim,isper,1,newcomm) MPI_Cart_shift(newcomm,0,1,
pup,pdown) MPI_Cart_shift(newcomm,1,1,pleft,
pright) MPI_Comm_rank (newcomm, myrank) /
my place in MPI system / MPI_Cart_coords(newcomm,
myrank,2,coords)
61
???????? ?????. MPI-??????
/ rows of matrix I have to process / srow
(coords0 L) / dim0 lrow (((coords0
1) L) / dim0)-1 nrow lrow - srow 1 /
columns of matrix I have to process / scol
(coords1 L) / dim1 lcol (((coords1
1) L) / dim1)-1 ncol lcol - scol
1 MPI_Type_vector(nrow,1,ncol2,MPI_DOUBLE,vecty
pe) MPI_Type_commit(vectype) m_printf("JAC2
STARTED on dd processors with dd array,
itd\n",dim0,dim1,L,L,ITMAX) /
dynamically allocate data structures / A
malloc ((nrow2) (ncol2) sizeof(double)) B
malloc (nrow ncol sizeof(double))
62
???????? ?????. MPI-??????
for(i0 iltnrow-1 i) for(j0
jltncol-1 j) Ai1j10. Bij
1.srowiscolj / iteration loop
/ MPI_Barrier(newcomm)
t1MPI_Wtime() for(it1 itltITMAX
it) for(i0 iltnrow-1 i)
if (((i0)(pupMPI_PROC_NULL))((inrow-
1)(pdownMPI_PROC_NULL))) continue
for(j0 jltncol-1 j)
if (((j0)(pleftMPI_PROC_NULL))((jnc
ol-1)(prightMPI_PROC_NULL))) continue
Ai1j1 Bij
63
???????? ?????. MPI-??????
MPI_Irecv(A01,ncol,MPI_DOUBLE, pup,
1235, MPI_COMM_WORLD, req0) MPI_Isend(Anro
w1,ncol,MPI_DOUBLE, pdown, 1235,
MPI_COMM_WORLD,req1) MPI_Irecv(Anrow11
,ncol,MPI_DOUBLE, pdown, 1236,
MPI_COMM_WORLD, req2) MPI_Isend(A11,nc
ol,MPI_DOUBLE, pup, 1236,
MPI_COMM_WORLD,req3) MPI_Irecv(A10,1,ve
ctype, pleft, 1237, MPI_COMM_WORLD,
req4) MPI_Isend(A1ncol,1,vectype,
pright, 1237, MPI_COMM_WORLD,req5) MPI_Ir
ecv(A1ncol1,1,vectype, pright,
1238, MPI_COMM_WORLD, req6) MPI_Isend(A1
1,1,vectype, pleft, 1238,
MPI_COMM_WORLD,req7) MPI_Waitall(8,req,statu
s)
64
???????? ?????. MPI-??????
for(i1 iltnrow i) if
(((i1)(pupMPI_PROC_NULL))
((inrow)(pdownMPI_PROC_NULL)))
continue for(j1 jltncol
j) if (((j1)(pleftMPI_PROC_NULL
)) ((jncol)(prightMPI_PROC_NULL)))
continue Bi-1j-1 (Ai-1jAi1j
Aij-1Aij1)/4.
printf("d Time of tasklf\n",myrank,MPI_Wtime()
-t1) MPI_Finalize () return
0
65
???????? ?????. MPI/OpenMP-??????
  • / iteration loop
    /
  • t1MPI_Wtime()
  • pragma omp parallel default(none)
    private(it,i,j) shared (A,B,myrank,
    nrow,ranksize,ll,shift,req,status)
  • for(it1 itltITMAX it)
  • for(i1 iltnrow i)
  • if (((i1)(myrank0))((inrow)(myrankra
    nksize-1))) continue
  • pragma omp for nowait
  • for(j1 jltL-2 j)
  • Aij Bi-1j

66
???????? ?????. MPI/OpenMP-??????
  • pragma omp barrier
  • pragma omp single
  • if(myrank!0)
  • MPI_Irecv(A00,L,MPI_DOUBLE, myrank-1, 1235,
    MPI_COMM_WORLD, req0)
  • if(myrank!ranksize-1)
  • MPI_Isend(Anrow0,L,MPI_DOUBLE, myrank1,
    1235, MPI_COMM_WORLD,req2)
  • if(myrank!ranksize-1)
  • MPI_Irecv(Anrow10,L,MPI_DOUBLE, myrank1,
    1236, MPI_COMM_WORLD, req3)
  • if(myrank!0)
  • MPI_Isend(A10,L,MPI_DOUBLE, myrank-1, 1236,
    MPI_COMM_WORLD,req1)
  • ll4 shift0 if (myrank0) ll2shift2
  • if (myrankranksize-1) ll2
  • MPI_Waitall(ll,reqshift,status0)

67
???????? ?????. MPI/OpenMP-??????
for(i1 iltnrow i) if
(((i1)(myrank0))((inrow)(myrankranks
ize-1))) continue pragma omp for
nowait for(j1 jltL-2 j) Bi-1j
(Ai-1jAi1j Aij-1Aij1)/4
. /DO it/ printf("d Time of
tasklf\n",myrank,MPI_Wtime()-t1)
MPI_Finalize () return 0
68
????????? ?????? DVM/OpenMP
??????
DVM
DVM
??????
??????????
OpenMP

???? N
69
???????? ?????. DVM/OpenMP-??????
  • PROGRAM JAC_OpenMP_DVM
  • PARAMETER (L1000, ITMAX100)
  • REAL A(L,L), B(L,L)
  • CDVM DISTRIBUTE ( BLOCK, BLOCK) A
  • CDVM ALIGN B(I,J) WITH A(I,J)
  • PRINT , '
    TEST_JACOBI '
  • COMP PARALLEL DEFAULT(NONE ) SHARED(A,B)
    PRIVATE(IT,I,J)
  • DO IT 1, ITMAX
  • CDVM PARALLEL (J,I) ON A(I, J)
  • DO J 2, L-1
  • COMP DO
  • DO I 2, L-1
  • A(I, J)
    B(I, J)
  • ENDDO
  • COMP ENDDO NOWAIT
  • ENDDO

70
???????? ?????. DVM/OpenMP-??????
  • COMP BARRIER
  • CDVM PARALLEL (J,I) ON B(I, J),
    SHADOW_RENEW (A)
  • DO J 2, L-1
  • COMP DO
  • DO I 2, L-1
  • B(I, J)
    (A(I-1, J) A(I, J-1) A(I1, J) A(I, J1))
    / 4
  • ENDDO
  • COMP ENDDO NOWAIT
  • ENDDO
  • ENDDO
  • COMP END PARALLEL
  • END

71
????? NASA MultiZone
BT (Block Tridiagonal Solver) 3D ?????-?????,
????? ?????????? ??????????? LU (Lower-Upper
Solver) 3D ?????-?????, ????? ??????? ??????????
SP (Scalar PentadiagonalSolver) 3D ?????-?????,
Beam-Warning approximate factorization http//www
.nas.nasa.gov/News/Techreports/2003/PDF/nas-03-010
.pdf
72
????? NASA MultiZone
73
???? SP-MZ (????? A) ?? IBM eServer pSeries 690
Regatta
DVM
MPI
74
???? LU-MZ (????? A) ?? IBM eServer pSeries 690
Regatta
DVM
MPI
75
???? BT-MZ (????? A) ?? IBM eServer pSeries 690
Regatta???? ?? 13 x 13 x 16 ? ?? 58 x 58 x 16
DVM
MPI
76
???????????? ????????? ?????? MPI/OpenMP
  • ?????????? ??? ?????????? ???????????? ?????? ?
    ?????? ????.
  • ?????????????? ??????? ???????????? ?? OpenMP
    ??????????? ?????, ??? ?? MPI (????????, ????? ?
    ????????? ???? ??? ?????? ????????????
    ??????????? ????? ??????????? ? ???????????
    ?????? ?????????).
  • ????????? ???????????? ?? ???????????? ???????
    ??? ??????? ???????????? ?????????? ??? ??????
    ?????? ????????????.

77
???????????? ????????? ?????? DVM/OpenMP
  • ???????? ??????? OpenMP ? DVM, ??? ???????? ??
    ?????????? ?????????????.
  • ????????? ?????? ????????, ?????????
    ????????????? ?? ?????????????? ?????
    SMP-????????.
  • ??????????? ????????????? ???????????? ?????????
    ??? ????????????????, ??? OpenMP-?????????, ???
    DVM-?????????, ? ??? DVM/OpenMP -?????????.

78
????????? ?????????????? ???????
????????? ?????????????? ??????? ???
???????????? ?????????????? ???????, ? ???????,
?????? ? ?????????????? ????????????,
???????????? ????????? ???? ???????????, ???
?????????????, ?????????? ?????????????
???????????? (?????????, ??????????????,
????????????????? ??? ?????????? ?????? ? ?. ?.).
? ?????? ?????? ?????????? ????? ?????????????,
????????? ???????? ?????????? ????????????.
79
?????????????? ???????? ???-???????
80
?????? PGI Accelerator ??? Fortran ? ??
!acc data region copy(a(1n,1m))
local(b(2n-1,2m-1)) copyin(w(2n-1)) do
while(resid .gt. tol) resid 0.0 !acc
region do i 2, n-1 do j 2, m-1
b(i,j) 0.25w(i)(a(i-1,j)a(i,j-1)

a(i1,j)a(i,j1))
(1.0-w(i))a(i,j) enddo enddo
81
?????? PGI Accelerator ??? Fortran ? ??
do i 2, n-1 do j 2, m-1
resid resid (b(i,j)-a(i,j))2
a(i,j) b(i,j) enddo enddo
!acc end region enddo !acc end data
region http//www.pgroup.com/lit/whitepapers/pgi_
accel_prog_model_1.2.pdf
82
??????????
  • OpenMP Application Program Interface Version 3.0,
    May 2008.
  • http//www.openmp.org/mp-documents/spec30.pdf
  • MPI A Message-Passing Interface Standard Version
    2.2, September 2009. http//www.mpi-forum.org/docs
    /mpi-2.2/mpi22-report.pdf
  • ???????????? ???????????????? ?? ????? C-DVM.
    ???????????? ??????? ?? ?????????? ??? ?????????
    2-4 ??????. ??? ??. ?.?.??????????. ?????????
    ?M??. ??????, 2002 ?. ftp//ftp.keldysh.ru/K_stude
    nt/DVM-practicum/method_CDVM_2006.doc
  • ??????? ?.?. ???????????? ???????????????? ?
    ?????????????? ?????????? OpenMP ???????
    ???????.-?. ???-?? ???, 2009.
  • http//parallel.ru/info/parallel/openmp/OpenM
    P.pdf
  • ??????? ?.?. ???????????? ???????????????? ?
    ?????????????? ?????????? MPI ???????
    ???????.-?. ???-?? ???, 2004.
  • http//parallel.ru/tech/tech_dev/MPI/mpibook.pdf
  • ???????? ?.?., ???????? ??.?. ????????????
    ??????????. ???. ???-?????????, 2002.

83
????????
Write a Comment
User Comments (0)
About PowerShow.com