Title: ???: Intel® Math Kernel Library (MKL)
1???Intel® Math Kernel Library (MKL)
??? http://www.njyangqs.com/
????????????
2??
3???
- ??? Intel MKL
- Intel ?????????????/?????
- ?Intel processors??
- ????????SMP?????????
4??
- Intel???????
- ??,??,????!
- Intel??????????????
- ??
- ???? (BLAS, LAPACK)
- ????/?????(BLAS, LAPACK)
- ?????????(dgemm)
- ?????, ????, ??, ????(FFTs)
- ????,?????????? (VML)?????????(VSL)
- ?Intel????????????
5??
- Intel ??????? ????
- ?????Intel Math Kernel Library (Intel MKL) ??
- ????????Intel MKL
- ????n?????????.
[Figure labels: X Y Z W; 4x4 ????]
????
6??
- Intel ???????
- BLAS (?????????)
- Level 1 BLAS ??-????
- 15?????
- 48???
- Level 2 BLAS ??-????
- 26?????
- 66???
- Level 3 BLAS ??-????
- 9?????
- 30???
- Extended BLAS ??????level 1 BLAS
- 8?????
- 24???
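The three BLAS levels listed above differ in the shape of their operands: vector-vector, matrix-vector, and matrix-matrix. The following is a minimal plain-C sketch of one operation per level, not the MKL implementation. The names ref_dot, ref_gemv and ref_gemm are hypothetical, and row-major storage is assumed.

```c
#include <assert.h>
#include <stddef.h>

/* Level 1 (vector-vector): dot product, O(n) work. */
double ref_dot(size_t n, const double *x, const double *y)
{
    double s = 0.0;
    for (size_t i = 0; i < n; i++)
        s += x[i] * y[i];
    return s;
}

/* Level 2 (matrix-vector): y = A*x for a row-major m-by-n A, O(m*n) work. */
void ref_gemv(size_t m, size_t n, const double *a, const double *x, double *y)
{
    assert(a && x && y);
    for (size_t i = 0; i < m; i++)
        y[i] = ref_dot(n, a + i * n, x);
}

/* Level 3 (matrix-matrix): C = A*B with A m-by-k and B k-by-n, O(m*n*k) work. */
void ref_gemm(size_t m, size_t n, size_t k,
              const double *a, const double *b, double *c)
{
    assert(a && b && c);
    for (size_t i = 0; i < m; i++)
        for (size_t j = 0; j < n; j++) {
            double s = 0.0;
            for (size_t p = 0; p < k; p++)
                s += a[i * k + p] * b[p * n + j];
            c[i * n + j] = s;
        }
}
```

Level 3 does the most arithmetic per element loaded, which is why it tends to benefit most from a tuned library.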
7??
- Intel ???????
- LAPACK (?????)
- ???????????. ????????
- ?????1000???????????
- DFTs (???????)
- ????, ????
- ???
- VML (?????)
- ???????
- ????libm??,????
- VSL (??????)
- ?????????
8??
- Intel ???????
- BLAS?LAPACK ??Fortran.
- ????????
- VSL?VML?Fortran?C??
- DFTs?Fortran 95?C??
- cblas?????C/C++?????BLAS
9??
- Intel Math Kernel Library (Intel MKL) ??
- ??32??64?Intel???
- ???????????
- ?????
             Windows                  Linux
Compilers    Intel, CVF, Microsoft    Intel, GNU
Libraries    .dll, .lib               .a, .so
10??
11????
- ???????
- ?????????????
- ??????? ?????????????
- CPU ?????, ????
- Cache ???????cache???? ??cache??
- TLBs ?????????????
- ???? ??????????.
- ??? ?????????????
- ?? ?????????(????).
12????
- ???
- ???Intel Math Kernel Library (Intel MKL) ?????,??
- ??????????
- ???level 1?level 2 BLAS ??????( O(n) )
- ??????????
- Level 3 BLAS ( O(n^3) )
- LAPACK ( O(n^3) )
- FFTs ( O(n log(n)) )
- VML, VSL ? ????????
- ??????????OpenMP.
- ??Intel MKL????????????????
13??
14???
- ??Intel Math Kernel Library (Intel MKL)
- ??1 ifort, BLAS, IA-32???
- ifort myprog.f mkl_c.lib
- ??2 CVF, LAPACK, IA-32???
- f77 myprog.f mkl_s.lib
- ??3 ???????????DLL???????C??
- link myprog.obj mkl_c_dll.lib
- ????????????????????????
15???
Roll Your Own
    for (i = 0; i < n; i++)
        for (j = 0; j < m; j++)
            for (k = 0; k < kk; k++)
                c[i][j] += a[i][k] * b[k][j];
ddot
    for (i = 0; i < n; i++)
        for (j = 0; j < m; j++)
            c[i][j] = cblas_ddot(n, a[i], incx, &b[0][j], incy);
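The ddot variant works because a stride lets the routine walk down a column of a row-major matrix. The sketch below uses a plain-C stand-in with the same argument order as cblas_ddot. The names ref_ddot and matmul_by_ddot are hypothetical, and square n-by-n row-major matrices are assumed to keep the example short.

```c
#include <assert.h>

/* Plain-C stand-in with the argument order of cblas_ddot (positive strides only). */
double ref_ddot(int n, const double *x, int incx, const double *y, int incy)
{
    assert(incx > 0 && incy > 0);
    double s = 0.0;
    for (int i = 0; i < n; i++)
        s += x[i * incx] * y[i * incy];
    return s;
}

/* C = A*B for row-major n-by-n matrices, one dot product per element of C. */
void matmul_by_ddot(int n, const double *a, const double *b, double *c)
{
    for (int i = 0; i < n; i++)
        for (int j = 0; j < n; j++)
            /* row i of A has stride 1; column j of B has stride n */
            c[i * n + j] = ref_ddot(n, a + i * n, 1, b + j, n);
}
```

Here incx = 1 and incy = n, which is presumably what the slide's incx and incy stand for.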
16???
dgemv
    for (i = 0; i < n; i++)
        cblas_dgemv(CblasRowMajor, CblasNoTrans, m, n,
                    alpha, a, lda, &b[0][i], ldb, beta,
                    &c[0][i], ldc);
dgemm
    cblas_dgemm(CblasColMajor, CblasNoTrans, CblasNoTrans,
                m, n, kk, alpha, b, ldb, a, lda,
                beta, c, ldc);
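The dgemm call passes b before a while requesting column-major layout. This appears to rely on the identity (A*B)^T = B^T * A^T: a row-major array read as column-major is the transpose, so swapping the operands yields the row-major product. The sketch below checks that idea with a plain-C column-major routine. The names ref_gemm_colmajor and rowmajor_matmul are hypothetical and this is not the MKL code.

```c
#include <assert.h>

/* Column-major C = alpha*A*B + beta*C, with A m-by-k and B k-by-n, no transposes. */
void ref_gemm_colmajor(int m, int n, int k, double alpha,
                       const double *a, int lda, const double *b, int ldb,
                       double beta, double *c, int ldc)
{
    assert(lda >= m && ldb >= k && ldc >= m);
    for (int j = 0; j < n; j++)
        for (int i = 0; i < m; i++) {
            double s = 0.0;
            for (int p = 0; p < k; p++)
                s += a[i + p * lda] * b[p + j * ldb];
            c[i + j * ldc] = alpha * s + beta * c[i + j * ldc];
        }
}

/* Row-major C = A*B (A m-by-k, B k-by-n) using the column-major routine
   with the operands and the m/n dimensions swapped. */
void rowmajor_matmul(int m, int n, int k,
                     const double *a, const double *b, double *c)
{
    for (int i = 0; i < m * n; i++)
        c[i] = 0.0;  /* keep beta*C well defined even if c was uninitialized */
    ref_gemm_colmajor(n, m, k, 1.0, b, n, a, k, 0.0, c, n);
}
```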
17???
- ?? 1 DGEMM
- ?????C??,DDOT,DGEMV?DGEMM?????????
- ???MKL/BLAS????????
18??
19???
- LAPACK???Intel ?????
- ?????LAPACK??
- ??? ????????????
- ????
- ??????(Amdahl's law: t = t_serial + t_parallel / p)
- ????????????????,??????
- ?????????
- NETLIB LAPACK????????????????????,??????Intel
MKL?,??????????,??????????????
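Amdahl's law, read here as t = t_serial + t_parallel / p, bounds what threading can achieve. The subscript names are an assumption, since the original labels did not survive. A minimal sketch with hypothetical function names:

```c
#include <assert.h>

/* Run time on p processors: the serial part stays, the parallel part is divided. */
double amdahl_time(double t_serial, double t_parallel, int p)
{
    assert(p >= 1);
    return t_serial + t_parallel / p;
}

/* Speedup relative to running on one processor. */
double amdahl_speedup(double t_serial, double t_parallel, int p)
{
    return (t_serial + t_parallel) / amdahl_time(t_serial, t_parallel, p);
}
```

For example, with 1 unit of serial work and 8 units of parallel work, 4 processors give a run time of 3 units and a speedup of 3, not 4. No processor count can push the run time below the serial part.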
20???
- ???????(DFTs)
- 1?,2?,3?(???????)
- ?????
- ?????????????????,????????????????????????,???????
?????????? - ??????,?????
- ???????,????????????
- C?F90??
21???
- ?Intel ?????????????
- ????3???
- ???????.
- Status = DftiCreateDescriptor(&MDH, ...)
- ?????(????).
- Status = DftiCommitDescriptor(MDH)
- ????.
- Status = DftiComputeForward(MDH, X)
- ?????(???)
22???
- ?????(VML)???/??
- ????? ?????? ?libm,????(??)
- ?? ?Fortran?C????
- ?????
- ???( < 1 ulp )
- ????, ??( < 4 ulps )
- ??????√(-a), sin(0),?
- ???? ??????libm
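The accuracy figures above are given in ulps (units in the last place). One common way to measure the distance in ulps between a computed and a reference double is to compare their bit patterns. The sketch below assumes IEEE 754 doubles. The names ordered_bits and ulp_distance are hypothetical helpers, not part of VML.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Map a double's bit pattern onto a monotonically ordered integer line,
   so that adjacent doubles map to adjacent integers and -0.0 == +0.0. */
static int64_t ordered_bits(double x)
{
    int64_t i;
    memcpy(&i, &x, sizeof i);
    return i < 0 ? INT64_MIN - i : i;
}

/* Number of representable steps between a and b (neither may be NaN). */
uint64_t ulp_distance(double a, double b)
{
    assert(a == a && b == b);  /* NaN compares unequal to itself */
    int64_t ia = ordered_bits(a), ib = ordered_bits(b);
    return ia > ib ? (uint64_t)ia - (uint64_t)ib
                   : (uint64_t)ib - (uint64_t)ia;
}
```

A result within 1 ulp of the correctly rounded value then shows up as a distance of 0 or 1.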
23???
- VML ?????
- ?????????(???????).
- ??, ??
- ??????????????
- ????????????????????.
24???
- ??????(VSL)
- ???????(RNGs)
- ????????
- VSL???????????
- ??????? ????
- ?????????BRNG???
- 5????RNGs (BRNGs) ?,??,???
- MCG31, R250, MRG32, MCG59, WH
25???
- ???RNGs
- Gaussian (two methods)
- Exponential
- Laplace
- Weibull
- Cauchy
- Rayleigh
- Lognormal
- Gumbel
26???
- ??VSL
- ???????
- ???????
- VSLStreamStatePtr stream;
- ?????.
- vslNewStream( &stream, VSL_BRNG_MCG31, seed );
- ????RNGs.
- vsRngUniform( 0, stream, size, out, start, end );
- ?????(??).
- vslDeleteStream( &stream );
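The create, generate, delete life cycle above can be imitated in plain C to see how the pieces fit together. The sketch below is not VSL: demo_stream and the demo_* functions are hypothetical, and the generator swapped in is the Park-Miller "minimal standard" multiplicative congruential generator (multiplier 16807, modulus 2^31 - 1), chosen only because it is short.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical stream object holding the generator state (not the VSL type). */
typedef struct { uint32_t state; } demo_stream;

/* Step 1: create a stream from a seed. Returns NULL if allocation fails. */
demo_stream *demo_new_stream(uint32_t seed)
{
    demo_stream *s = malloc(sizeof *s);
    if (s) {
        s->state = seed % 2147483647u;
        if (s->state == 0)
            s->state = 1;  /* 0 would stay 0 forever */
    }
    return s;
}

/* Step 2: fill out[0..n) with uniform values between a and b in one call. */
void demo_uniform(demo_stream *s, size_t n, double *out, double a, double b)
{
    assert(s && out && a < b);
    for (size_t i = 0; i < n; i++) {
        s->state = (uint32_t)(((uint64_t)s->state * 16807u) % 2147483647u);
        out[i] = a + (b - a) * ((s->state - 1) / 2147483646.0);
    }
}

/* Step 3: release the stream and clear the caller's pointer. */
void demo_delete_stream(demo_stream **s)
{
    free(*s);
    *s = NULL;
}
```

Generating a whole vector per call, rather than one number per call, is the design point the stream interface is built around.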
27???
- ?? ????????π
- ???????,S = πr^2
- ??????,??,1/4??????????????????,???????1,??????
??????????,???1/4????????????1/4??????? - ??????1/4???????????????????? x^2 + y^2 = 1
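Assuming the passage describes the usual quarter-circle argument (points drawn uniformly from a square of side r, counted when they fall under the curve x^2 + y^2 = r^2, with r = 1 here), the estimate can be written out as follows. The symbols N (points drawn) and N_in (points under the curve) are introduced for illustration and do not appear in the original.

\[
\frac{N_{\text{in}}}{N} \approx \frac{\text{area of quarter circle}}{\text{area of square}}
= \frac{\pi r^2 / 4}{r^2} = \frac{\pi}{4}
\quad\Longrightarrow\quad
\pi \approx 4\,\frac{N_{\text{in}}}{N}
\]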
28???
    #include <stdlib.h>

    int main()
    {
        unsigned int iter = 200000000;
        int i, j;
        double x, y;
        double dUnderCurve = 0.0;
        double pi = 0.0;

        srand(0);
        for (i = 0; i < iter; i++) {
            /* one rand() call per coordinate */
            x = (double)rand() / (double)RAND_MAX;
            y = (double)rand() / (double)RAND_MAX;
            if (x*x + y*y < 1.0)
                dUnderCurve++;
        }
        pi = dUnderCurve / (double)iter * 4;
        return 0;
    }
29???
    unsigned int iter = 200000000;
    int i, j;
    double x, y;
    double dUnderCurve = 0.0;
    double pi = 0.0;
    double r[BLOCK_SIZE*2];
    VSLStreamStatePtr stream;

    vslNewStream(&stream, BRNG, (int)clock());
    for (j = 0; j < iter/BLOCK_SIZE; j++) {
        /* one call fills a whole block: x values first, then y values */
        vdRngUniform(METHOD, stream, BLOCK_SIZE*2, r, 0.0, 1.0);
        for (i = 0; i < BLOCK_SIZE; i++) {
            x = r[i];
            y = r[i+BLOCK_SIZE];
            if (x*x + y*y < 1.0)
                dUnderCurve++;
        }
    }
    vslDeleteStream(&stream);
    pi = dUnderCurve / (double)iter * 4;
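The blocked structure can be tried without MKL. The sketch below keeps the same loop shape but replaces vdRngUniform with a hypothetical fill_uniform built on the Park-Miller "minimal standard" generator. BLOCK_SIZE is set to 1024 as an assumption, since the source does not give its value, and the function name estimate_pi_blocked is also hypothetical.

```c
#include <assert.h>
#include <stdint.h>

#define BLOCK_SIZE 1024  /* assumed value; not given in the source */

/* Stand-in for vdRngUniform: fills r[0..n) with values in [0, 1). */
static void fill_uniform(uint32_t *state, int n, double *r)
{
    for (int i = 0; i < n; i++) {
        *state = (uint32_t)(((uint64_t)*state * 16807u) % 2147483647u);
        r[i] = (*state - 1) / 2147483646.0;
    }
}

/* Estimate pi from about iter points, generated one block at a time. */
double estimate_pi_blocked(unsigned int iter, uint32_t seed)
{
    assert(iter >= BLOCK_SIZE && seed > 0 && seed < 2147483647u);
    double r[BLOCK_SIZE * 2];
    double hits = 0.0;
    unsigned int blocks = iter / BLOCK_SIZE;

    for (unsigned int j = 0; j < blocks; j++) {
        /* first half of the block holds x values, second half y values */
        fill_uniform(&seed, BLOCK_SIZE * 2, r);
        for (int i = 0; i < BLOCK_SIZE; i++) {
            double x = r[i], y = r[i + BLOCK_SIZE];
            if (x * x + y * y < 1.0)
                hits += 1.0;
        }
    }
    /* divide by the number of points actually drawn */
    return 4.0 * hits / ((double)blocks * BLOCK_SIZE);
}
```

Unlike the fragment above, this version divides by the number of points actually generated, so an iteration count that is not a multiple of BLOCK_SIZE does not bias the result.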