Build Environment - PowerPoint PPT Presentation

About This Presentation
Title:

Build Environment

Description:

HP-UX hardware and runtime environment (7) ARIES ... read man mpidebug for MPI debugging instructions. page 36. 6/10/09. Queens University Belfast ... – PowerPoint PPT presentation

Slides: 54
Provided by: sabinebu

Transcript and Presenter's Notes

Title: Build Environment


1
(No Transcript)
2
Build Environment Performance Tuning for
Itanium 2 Processor with emphasis on HP-UX
  • Frank Haase
  • Michael Riedmann
  • European Performance Centre
  • (aka Benchmark Centre)
  • Boeblingen - Germany

3
Agenda
  • Compilers - understand how to use them
    effectively for high performance
  • HP and Intel compilers
  • compiler options
  • directives/pragmas
  • Debugging
  • WDB/GDB, HP Caliper
  • MPI
  • Using HP MPI on SMP and Clusters

4
HP and Intel Compilers
  • see http://www.hp.com/go/lang

5
Performance Expectations
  • ITANIUM-2 vs PA-RISC performance ratio: 1.5X to
    2.5X
  • ITANIUM-2 vs ALPHA performance ratio: 1.0X to
    1.2X
  • If these ratios can't be achieved in a
    benchmark, then something is wrong.

6
HP-UX hardware and runtime environment (1): Data
models
  • Evolution from 32 bit to 64 bit. The kernel is 64
    bit. User processes can be either 32 or 64
    bit. Benefit: no forced 64 bit migration from 32
    bit platforms.
  • Compiler default is 32 bit. For the 64 bit data
    model use +DD64 on IPF and +DA2.0W on PA-RISC.
  • The 64 bit model is LP64, the 32 bit model is
    ILP32. Caution with long in C! Use long long to
    always get 64 bit integers, or use size_t
    whenever possible.
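
The caution about long can be checked on any host. The following is a minimal sketch, not part of the original slides: it uses Python's ctypes module to report the C type sizes of the ABI the interpreter was built for. The data_model helper is a hypothetical name introduced only for this illustration.

```python
import ctypes

# Sizes (in bytes) of C types for the ABI this interpreter was built against.
# On an LP64 build long is 8 bytes; on ILP32 it is 4. long long is 8 in both.
sizes = {
    "int": ctypes.sizeof(ctypes.c_int),
    "long": ctypes.sizeof(ctypes.c_long),
    "long long": ctypes.sizeof(ctypes.c_longlong),
    "size_t": ctypes.sizeof(ctypes.c_size_t),
    "void *": ctypes.sizeof(ctypes.c_void_p),
}


def data_model(type_sizes):
    """Classify the data model from the int / long / pointer sizes."""
    key = (type_sizes["int"], type_sizes["long"], type_sizes["void *"])
    known = {(4, 4, 4): "ILP32", (4, 8, 8): "LP64", (4, 4, 8): "LLP64"}
    return known.get(key, "other")


for name, nbytes in sizes.items():
    print(f"{name:10s} {nbytes} bytes")
print("data model:", data_model(sizes))
```

Only long and the pointer types change between the two models, which is why code that stores pointers or file offsets in long breaks when moved between 32 bit and 64 bit builds.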

7
HP-UX hardware and runtime environment (2): Data
models and system libraries
  • Each system library and API is available in 4
    flavours in separate subdirectories
  • /usr/lib/pa1.1: PA-RISC 32bit
  • /usr/lib/pa20_64: PA-RISC 64bit
  • /usr/lib/hpux32: IPF 32bit
  • /usr/lib/hpux64: IPF 64bit
  • Same thing for extra products like MPI, MLIB, ...
  • Object format is ELF32 and ELF64, except for
    PA-RISC 32bit (+DA2.0), which uses SOM
  • Mixed linking is impossible. The linker returns
    an explicit message.

8
HP-UX hardware and runtime environment (3): Data
alignment
  • LINUX is little endian
  • HP-UX is big endian with both IPF and PA-RISC.
    Binary data compatibility with MIPS, SPARC,
    POWER. Binary data incompatibility with ALPHA,
    IA-32, LINUX.
  • Data alignment is very similar between Tru64 and
    HP-UX. Unaligned access on HP-UX causes SIGBUS.
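
To illustrate the binary data incompatibility, here is a small sketch (not from the slides) that uses Python's struct module to show how the same 32 bit integer is laid out in big endian and little endian byte order, and what happens when a file written on one side is read with the other side's native order. The helper name read_big_endian_u32 is hypothetical.

```python
import struct

value = 0x01020304

# Byte layout a big endian system (HP-UX on IPF or PA-RISC) writes to disk.
big = struct.pack(">I", value)
# Byte layout a little endian system (Linux on IA-32 or IPF) writes.
little = struct.pack("<I", value)

# Reading big endian data with little endian byte order scrambles the value.
misread = struct.unpack("<I", big)[0]


def read_big_endian_u32(raw):
    """Portable read: state the byte order explicitly instead of relying on the host."""
    return struct.unpack(">I", raw)[0]


print(big.hex(), little.hex(), hex(misread), hex(read_big_endian_u32(big)))
```

Stating the byte order explicitly in the I/O layer is one way to keep binary files portable across the platforms listed above.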

9
HP-UX hardware and runtime environment (4):
Exception handling
  • HP-UX ignores FP exceptions by default. Link with
    +FPDVONZ for Tru64 like behaviour. The API for
    runtime control of the FPU is defined in
    /usr/include/fenv.h
  • NULL pointer dereference by default returns
    0. Link with -z for Tru64 like SIGSEGV generation.
  • malloc(0) returns a valid pointer

10
HP-UX hardware and runtime environment (5): Memory
page size
  • HP-UX supports variable sized pages / large
    pages. HW page size is still 4k.
  • Page size is a property of the executable and can
    be modified with the chatr command, e.g.
    chatr +pd 4M +pi 4M or chatr +pd L +pi D
  • Large pages can drastically reduce TLB
    misses. Many HPTC apps get a huge performance
    boost from large pages.
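
A back-of-the-envelope sketch of why large pages cut TLB misses: the memory a TLB can map without a miss (its "reach") is the number of entries times the page size. The entry count of 128 below is an assumption chosen for illustration, not a figure from the slides.

```python
KB = 1024
MB = 1024 * KB

# ASSUMPTION: a data TLB with 128 entries; the real count depends on the CPU.
ASSUMED_TLB_ENTRIES = 128


def tlb_reach(entries, page_size):
    """Bytes of memory that can be mapped without taking a TLB miss."""
    return entries * page_size


reach_4k = tlb_reach(ASSUMED_TLB_ENTRIES, 4 * KB)
reach_4m = tlb_reach(ASSUMED_TLB_ENTRIES, 4 * MB)

print(f"4 KB pages: {reach_4k // KB} KB reach")
print(f"4 MB pages: {reach_4m // MB} MB reach")
```

Under this assumption the reach grows from 512 KB to 512 MB, so a solver whose working set is a few hundred megabytes stops missing the TLB on nearly every new page.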

11
HP-UX hardware and runtime environment (6): Large
files
  • Large files are the default for HP-UX / IPF but
    not for PA-RISC
  • On PA-RISC use -o largefiles for newfs and mount
  • Some HP-UX commands don't support large files,
    e.g. tar, cpio and pax fail to back up large
    files. Same problem with some open source tools.
  • Rebuild 32 bit programs with
    -D_LARGEFILE64_SOURCE
  • No problem with 64 bit programs
12
HP-UX hardware and runtime environment (7): ARIES
  • PA-RISC binaries can be run on IPF through
    dynamic translation
  • Slowdown is 3X for GUIs up to 10X for solvers
  • The slowdown is hardly noticeable with
    interactive tools like Vim, Netscape, Acroread
  • Many HP-UX tools on IPF like SAM are still
    PA-RISC binaries
  • Use the file command to identify the nature of an
    executable
  • IPF migration approach with ISVs
  • Rebuild solvers first
  • Rebuild GUIs and pre/post processors later

13
HP-UX Compilers: C/C++
  • /opt/aCC/bin/aCC and /opt/ansic/bin/cc are the
    ANSI C++ and C compilers
  • -AA: ANSI C++ with namespace std and new C++
    standard library. This is the default.
  • -AP: Turn off -AA and use older classic C++
    runtime libraries. Very useful for porting
    legacy and open source codes.
  • -Aa: strict ANSI C (TRU64 -std1)
  • -Ae: ANSI C with extensions (TRU64 -std)
  • no support for K&R mode
  • instantiation files are written to a repository
    on TRU64, to the object file on HP-UX

14
HP-UX Compilers: Fortran (77), 90, 95
  • /opt/fortran90/bin/f90 is the Fortran 90/95
    compiler
  • Supports native OpenMP 1.1 and legacy CONVEX/HP
    directives. +Oopenmp enables OpenMP
    directives, +Oparallel +O3 enables legacy
    CONVEX/HP directives.
  • Legacy f77 compiler is obsolete, f90 handles f77
    codes very well. Use +U77 to enable BSD 3f
    intrinsics.
  • f90 adds a trailing underscore to function names
    on IPF and PA-RISC 64 bit. No trailing underscores
    for PA-RISC 32 bit. Explicit control with +ppu
    (add trailing underscore) and +noppu (do not add
    trailing underscore).
  • Pragma example: !DEC$ ALIAS becomes !$HP$ ALIAS
15
HP-UX Compilers: Mixing Languages
  • HP-UX compiler drivers do NOT recognize other
    languages, need to compile C and Fortran programs
    separately
  • Make sure C symbols are lowercase and have a
    trailing underscore, or compile Fortran sources
    with +noppu
  • If aCC or ld is used for linking, FORTRAN
    libraries have to be passed explicitly to the
    linker (libF90.a, libIO77.a, -lm, -lc)
  • If f90 is used for linking it will find its
    libraries automatically
  • The what command returns the exact compiler
    version string
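
The naming rule for mixed C and Fortran builds can be written down as a tiny model. This is an illustrative sketch only: the platform keys and both function names are hypothetical, and the defaults simply restate what the slides say about where f90 appends the underscore.

```python
# Default f90 behaviour as described in the slides: trailing underscore on
# IPF and on PA-RISC 64 bit, none on PA-RISC 32 bit. Keys are hypothetical labels.
DEFAULT_PPU = {"ipf32": True, "ipf64": True, "pa64": True, "pa32": False}


def fortran_symbol(name, ppu):
    """Linker symbol for a Fortran routine: lowercase, optional trailing underscore."""
    symbol = name.lower()
    return symbol + "_" if ppu else symbol


def expected_c_name(name, platform, ppu=None):
    """Name a C function must have so Fortran code can call it on this platform."""
    if ppu is None:  # no explicit +ppu / +noppu given, fall back to the default
        ppu = DEFAULT_PPU[platform]
    return fortran_symbol(name, ppu)


print(expected_c_name("DAXPY", "ipf64"))             # default on IPF
print(expected_c_name("DAXPY", "ipf64", ppu=False))  # sources built with +noppu
```

A mismatch between these two spellings shows up as an unresolved symbol at link time, which is usually the first hint that +ppu or +noppu is set inconsistently.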

16
Intel V7.0 Linux Compilers
  • /opt/intel/compiler points to latest compiler
  • efc: Fortran
  • /opt/intel/compiler70/ia64/bin/efc
  • ecc: C, C++
  • /opt/intel/compiler70/ia64/bin/ecc
  • Source a subset of the following to set up the
    environment:
    /opt/intel/compiler70/ia64/bin/eccvars.{csh,sh}
    /opt/intel/compiler70/ia64/bin/efcvars.{csh,sh}
  • Useful web page:
    http://www.intel.com/software/products/compilers/

17
Compilers: Directives and Pragmas
  • HP Fortran compiler directive
  • c$dir
  • HP C compiler pragma
  • #pragma _cnx
  • (note blank between pragma and _cnx above)
  • OpenMP directives
  • c$omp
  • Preferred for directive based parallelism
  • C$OMP parallel do private(x,y) shared(z)
  • Intel compiler directives
  • cdec$
  • cdir$

18
HP-UX and Intel Linux Compilers: Architecture and
Data Model Switches
  • +DA2.0N +DS2.0: PA-RISC (2.0) 32 bit
  • +DA2.0W +DS2.0: PA-RISC (2.0) 64 bit
  • +DSmckinley +DD32: ITANIUM-2 32 bit
  • +DSmckinley +DD64: ITANIUM-2 64 bit
  • (+DSmckinley = +DSitanium2)
  • -tpp2: ITANIUM-2 64 bit with INTEL compiler
  • Recommendations
  • +DSmckinley and -tpp2 are also the best choice
    for Madison code.
  • +DSitanium, +DSblended and -tpp1 should be used
    only if the target is ITANIUM-1. This code
    performs 20% slower on ITANIUM-2.

19
Intel Linux Compilers: Optimisation Levels
  • -O2
  • very safe, register rotation, no extra unrolling,
    no prefetch instructions
  • -O3
  • usually safe, lots of optimizations including
    load word pair generation, up to 8-way unrolling,
    prefetch instructions

20
HP-UX Compilers: Optimisation Levels (1)
  • +O0: Default, minimal optimisation. Fastest
    compile time. Good debugging support.
  • +O1: Basic block level optimisation. Pretty fast
    compile time. Improved runtime performance.
    Good debugging support.
  • +O2: Full routine level optimisation. Register
    rotation and data prefetching. Limited debugging
    support. Good runtime performance. Inlining
    for sqrt.
  • +O2 is sufficient for most FORTRAN codes

21
HP-UX Compilers: Optimisation Levels (2)
  • +O3: Full source file level optimisation. No
    debugging support (-g is invalid). Adds
    subroutine cloning and inlining (only within the
    source file). Adds transformations for nested
    loops. Inlines all math intrinsics on
    IPF. Matches and inlines inverse square roots if
    +Ofltacc=relaxed. Use +Oinfo or +Oreport=all for
    an optimisation report.
  • +O3 is not always better than +O2. Use it
    deliberately for
  • inlining of math intrinsics and frequently called
    routines
  • transformation of nested loops
  • optimized inverse square roots (e.g. quantum
    chemistry)

22
HP-UX Compilers: Global and Profile Based
Optimisation
  • +O4: Performs global optimisation at link time.
    Can be combined with Profile Based Optimisation
    (PBO).
  • +Oprofile=collect: Make an instrumented
    executable for profiling. After execution it
    will dump the data in flow.data
  • +Oprofile=use: Read profile data from flow.data
    and use it for global optimisation
  • +O4 and PBO are most useful for C and C++ as they
    provide global inlining capability and reduce
    branch misprediction. The benefit for FORTRAN
    codes is very limited due to common programming
    practices.

23
HP-UX Compilers: Prefetching
  • +O[no]dataprefetch[=direct|indirect|none]
  • Control generation of data prefetch instructions
    for data structures referenced within inner most
    loops. The defined values for kind are
  • direct: Enable generation of data prefetch
    instructions for the benefit of direct memory
    accesses, but not indirect memory accesses.
  • indirect: Enable generation of data prefetch
    instructions for the benefit of both direct and
    indirect memory accesses. This is the default at
    optimization levels +O2 and above.
  • none: Disable generation of data prefetch
    instructions. This is the default at
    optimization levels +O1 and below.

24
HP-UX and Intel Linux Compilers: Fortran prefetch
directives
  • HP-UX: c$dir prefetch (expression)
  • no special compile options needed
  • Intel: cdir$ noprefetch A,B,..
  • Allows the user to prefetch explicitly where the
    compiler fails, e.g. when addresses are computed
  • do i = 1,n
  • ia = func(i)
  • c$dir prefetch b(func(i+50))
  • b(ia) = b(ia) + a(i)
  • enddo

25
HP-UX Compilers: Floating-Point Accuracy
  • +Ofltacc=[strict|default|limited|relaxed]
  • Control the level of FP optimizations that the
    compiler may perform.
  • Useful for debugging when there are numerical
    instabilities
  • default: Allow contractions, such as fused
    multiply-add (FMA), but disallow any other
    optimization that can result in numerical
    differences.
  • limited: Like default, but also allows floating
    point optimizations which may affect the
    generation and propagation of infinities, NaNs,
    and the sign of zero.
  • relaxed: In addition to the optimizations allowed
    by limited, permits optimizations, such as
    reordering of expressions, even if parenthesized,
    that may affect a rounding error. This is the
    same as +Onofltacc.
  • strict: Disallow any floating point optimization
    that can result in numerical differences. This
    is the same as +Ofltacc.
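
Why reordering of expressions "may affect a rounding error" can be seen without any compiler. The sketch below is an illustration in Python using IEEE double precision, not HP compiler output: it evaluates the same sum with two different parenthesizations, which is exactly the freedom the relaxed setting grants the optimizer.

```python
# Floating point addition is not associative, so a compiler that is allowed
# to reorder parenthesized expressions can change the last bits of a result.
a, b, c = 0.1, 0.2, 0.3
left_to_right = (a + b) + c
right_to_left = a + (b + c)

# With operands of very different magnitude the effect is no longer small:
# 1.0 is below the spacing of doubles near 1e16 and is lost in the first form.
big = 1.0e16
lost = (big + 1.0) - big
kept = 1.0 + (big - big)

print(left_to_right, right_to_left)
print(lost, kept)
```

This is the kind of difference to expect when answers change between strict and relaxed builds: small in well conditioned code, arbitrarily large where cancellation is involved.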

26
Intel Linux Compilers: Floating-Point Accuracy
  • -IPF_fma[-] (-IPF_fma- to turn off fma
    generation)
  • Enable/disable the combining of floating point
    multiplies and add / subtract operations. Note:
    fmas are still generated but each corresponds to
    either an fmpy (fma x,y,f0) instruction or an
    fadd (fma x,f1,y) instruction
  • -IPF_fltacc[-]
  • Enable / disable optimizations that affect
    floating point accuracy

27
Inline Math Intrinsics with +Olibcalls
  • Not all intrinsics are treated equal
  • abs is inlined at all optimisation levels
  • sqrt is inlined at +O2 and above
  • Other math intrinsics like exp, log, pow, sin,
    are inlined at +O3
  • Reciprocal square roots (y = 1./sqrt(x))
  • IPF can compute rsqrt directly (no separate
    div/sqrt)
  • HP-UX comes with nonstandard rsqrt intrinsic
  • With +Ofltacc=relaxed the f90 compiler matches
    and calls rsqrt at +O2 but does inlining of rsqrt
    only at +O3. Use it carefully!
  • Nice performance boost in quantum chemistry
    (Coulomb forces)

28
Important Linker Options
  • Flush denormalized values to zero
  • HP-UX: Linking with +FPD flushes denormalized
    values to zero
  • Linux: Compile the main routine with -ftz and
    link normally
  • Archived libraries
  • HP-UX: -Wl,-aarchive or -Wl,-aarchive_shared to
    ensure archived libraries are used as much as
    possible
  • Linux: -static prevents linking with shared
    libraries. The ld default corresponds to HP-UX's
    -shared_archive. Use -Bstatic for archived
    libraries and -Bdynamic for shared libraries.

29
Compiler Flags for Parallelism
  • HP-UX
  • +Oopenmp: Enable OpenMP directives. Available at
    any optimisation level.
  • +Oparallel +O3: Enable HP/Convex directives and
    automatic parallelisation. Requires +O3.
  • +Oparallel +O3 +Onoautopar: Disable automatic
    parallelisation. Keep directive based
    parallelism.
  • +Oparallel +O3 +Onodynsel: Disable dynamic loop
    selection.
  • Intel Linux
  • -openmp: Enable OpenMP directives. Available at
    any optimisation level.
  • -parallel: Enable automatic parallelisation.
  • OpenMP is process based with Intel Linux while it
    is pthread based with HP-UX. Linking on HP-UX
    involves libomp, libcps, libpthread.

30
Environment Variables for Parallelism
  • HP-UX: MP_NUMBER_OF_THREADS sets the number of
    threads with HP / CONVEX directives
  • HP-UX: MP_IDLE_THREADS_WAIT sets the number of
    milliseconds a thread spins before suspending
    itself. If the number is less than 0, the threads
    will spin wait. Useful to prevent context switches
    and thread migration.
  • HP-UX: MP_GANG=[ON|OFF] Enable / disable gang
    scheduling for multithreaded and MPI apps. Useful
    for oversubscribed and throughput scenarios.
  • HP-UX and Linux: OMP_NUM_THREADS sets OpenMP
    parallelism
  • HP-UX and Linux: MLIB_NUMBER_OF_THREADS sets MLIB
    shared memory parallelism

31
HP-UX Compilers: Dangerous and Useless Switches
  • Wrong floating point answers can be caused by
  • +O3, +O4, +Ofltacc=[default|relaxed],
    +Onoparmsoverlap, +FPD. Use with caution and
    check your answers.
  • Useless switches, don't waste your time!
  • +Ovectorize: Matches specific loop patterns and
    replaces them with optimized library calls. Useful
    only for SPECfp.
  • +Oaggressive, +Oall: Lots of aggressive
    optimisations including +Ovectorize
  • +fastallocatable was never observed to improve
    anything

32
Recommended Build Approach
  • Get reference timings/outputs from PA-RISC or
    whatever
  • Set the right architecture and data model
    switches
  • Start with +O2 +Odataprefetch +Onolimit -g
    -Wl,+pd,L
  • In case of wrong answers or divergence add
    +Ofltacc=strict. In case of right answers you can
    add +FPD +Ofltacc=relaxed and check answers again.
  • For C/C++ try +Onoparmsoverlap and check answers
  • Now make a profile with prospect or caliper
  • Try +O3 and +Oloop_block for selected hotspot
    routines
  • Start trying source changes
  • For C/C++ try profile based optimisation

33
HP-UX Debuggers and Profilers
  • see http://www.hp.com/go/wdb
  • http://www.hp.com/go/hpcaliper

34
WDB/GDB Debugger
  • has replaced all previous debuggers on HP-UX
    (xdb, dde) for both PA-RISC and IPF.
  • Choice of user interfaces
  • gdb for command line use
  • vdb runs gdb in a split terminal window like xdb
  • wdb is a Motif GUI on top of gdb
  • Location /opt/langtools/bin

35
HP WDB Features
  • Support for 32 bit and 64 bit data models and all
    languages
  • Debugs optimized code up to +O2
  • Support for pthreads and consequently OpenMP
  • HW watchpoints
  • Memory checking (currently only on PA-RISC)
  • Array browsing
  • User definable buttons
  • Basic MPI support: An extra wdb instance is
    started for every MPI process, so this is usable
    up to maybe 4 processes. Read man mpidebug for
    MPI debugging instructions.

36
Debug Preparation
  • On PA-RISC
  • Compile and link with -g. Debug information is in
    the executable.
  • No support for 64 bit optimized programs. Compile
    at +O0.
  • 32 bit programs can be debugged at +O1 (good) and
    +O2 (limited)
  • Trouble with NFS mounted executables. Try to work
    on local disk.
  • On IPF
  • Compile and link with -g. Debug information stays
    in the object files. Compile with +noobjdebug for
    debug info in the executable.
  • Both 32 bit and 64 bit programs can be debugged
    at +O1 (good) and +O2 (limited). Note that
    register variables can't be viewed at +O2.

37
Profiling with CALIPER (1)
  • Highlights
  • Dynamic instrumentation requires no preparation
    for executables
  • All Itanium PMU counters are available for event
    counts
  • Supports multithreaded execution
  • Wish List
  • Support for MPI profiling and 3D charts for
    parallel profiles
  • Better source line mapping
  • Two modes of operation
  • Low intrusion hardware counters based
    measurements
  • Sample based/ clock driven profiling
  • Measure execution cycles, cache misses, branch
    mispredictions, etc.
  • Dynamic binary instrumentation for precise counts
  • e.g. function call graph, basic block arc counts
    accurately
  • event driven (Gprof or Cxperf type)

38
Profiling with CALIPER (2)
  • Usage: /opt/caliper/bin/caliper config-file
  • Common config-files (many more available)
  • cgprof call graph profile (intrusive)
  • sample_ip sampling profile like PROSPECT (non
    intrusive)
  • pbo create flow.data for PBO (requires +O1
    build)
  • dcache_miss cache metrics
  • Output options (many more available )
  • -o single file ASCII output
  • --html HTML output into directory

39
Profiling with CALIPER (3): measurements
  • type: config_file (comments)
  • optimization: pbo (black box)
  • total measurements: total_cpu (exact totals, no
    impact)
  • sample measurements: branch_prediction,
    [d|i]cache_miss, [d|i]tlb_miss, sample_[cpu|ip]
    (sampled details, low impact)
  • precise measurements: arc_count, func_count,
    func_cover (exact details, high impact)
  • hybrid: cgprof (sampled and exact, high impact)

40
Profiling with CALIPER (4): measurement strategy
  • if automatic optimization: pbo
  • if call graph profile: cgprof
  • if correctness testing: arc_count, func_count,
    func_cover
  • determine what takes time: total_cpu w/ various
    events
  • determine where event happens:
    branch_prediction, [d|i]cache_miss,
    [d|i]tlb_miss, sample_ip w/ select trigger
  • determine when event happens: sample_cpu w/
    select trigger

41
Profiling with CALIPER (5)
  • All PMU counters of ITANIUM can be used with
    CALIPER config files. See the list in
    /opt/caliper/doc/text/itanium2_cpu_counters
  • For counter description see the Intel Itanium-2
    Processor Reference Manual For Software
    Development and Optimisation
  • Measure total number of Flops, Underflows and
    Clock Cycles
  • caliper total_cpu global-counters \
    FP_OPS_RETIRED,FP_FLUSH_TO_ZERO,CPU_CYCLES \
  • Measure MFlops per subroutine
  • caliper sample_ip \ sampling-counter
    FP_OPS_RETIRED,,1000000 \
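
Turning the two totals into an MFlops figure is simple arithmetic. The following is a minimal sketch with made-up counter values: the counts and the 1 GHz clock are assumptions for illustration, not measurements from the slides.

```python
def mflops(fp_ops, cpu_cycles, clock_hz):
    """Million floating point operations per second from two PMU totals."""
    seconds = cpu_cycles / clock_hz  # run time implied by the cycle count
    return fp_ops / seconds / 1.0e6


# HYPOTHETICAL totals, standing in for FP_OPS_RETIRED and CPU_CYCLES.
example_fp_ops = 2_000_000_000
example_cycles = 1_000_000_000
assumed_clock_hz = 1.0e9  # assumption: 1 GHz processor

rate = mflops(example_fp_ops, example_cycles, assumed_clock_hz)
print(f"{rate:.0f} MFlops, {example_fp_ops / example_cycles:.1f} flops per cycle")
```

Flops per cycle is often the more telling number, since it can be compared directly against the peak rate of the processor regardless of clock speed.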

42
total_cpu example
  • caliper total_cpu crafty.O3
  • HP Caliper Total CPU Counts Report
  • Target Application
  • Program: /home/daveb/crafty.O3
  • Invocation: crafty.O3
  • Process ID: 8859 (started by Caliper)
  • Start time: 07:56:39 AM
  • End time: 07:56:43 AM
  • Last modified: June 01, 2002 at 07:56 AM
  • Processor Information
  • Machine name: longsp8
  • Number of processors: 2
  • Processor type: Itanium2
  • Processor speed: 798 MHz
  • Report Help
  • Use the caliper option --info to append help to
    this report,
  • or see /opt/caliper/doc/text/total_cpu.help.
  • -----------------------------------------
  • Counter          Priv. Mask   Count
  • -----------------------------------------
  • INST_DISPERSED   8 (USER)     5029272413
  • NOPS_RETIRED     8 (USER)     690880642
  • CPU_CYCLES       8 (USER)     3140760527
  • -----------------------------------------
  • CPI
  • 0.6245 = CPU_CYCLES / INST_DISPERSED
  • Useful CPI
  • 0.7239 = CPU_CYCLES / (INST_DISPERSED -
    NOPS_RETIRED)
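
The two derived metrics in the report can be reproduced from the three raw counters. This short sketch just redoes the arithmetic with the counts printed in the report; the variable and function names are chosen for this illustration.

```python
# Counter values copied from the total_cpu report for crafty.O3.
INST_DISPERSED = 5_029_272_413
NOPS_RETIRED = 690_880_642
CPU_CYCLES = 3_140_760_527


def cycles_per_instruction(cycles, instructions):
    return cycles / instructions


# CPI counts every dispersed instruction, including the NOPs that pad bundles.
cpi = cycles_per_instruction(CPU_CYCLES, INST_DISPERSED)
# Useful CPI excludes NOPs, so it reflects only instructions that do work.
useful_cpi = cycles_per_instruction(CPU_CYCLES, INST_DISPERSED - NOPS_RETIRED)
nop_fraction = NOPS_RETIRED / INST_DISPERSED

print(f"CPI        {cpi:.4f}")
print(f"Useful CPI {useful_cpi:.4f}")
print(f"NOP share  {nop_fraction:.1%}")
```

The gap between the two values comes from the NOP share: a little under 14% of the dispersed instructions in this run were NOPs.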

43
branch_prediction (html) example
44
More Profiling Tools ...
  • TUSC aka TRUSS is a system call tracer. Not
    supported but usable and robust. Get it here:
  • ftp://ftp.cup.hp.com/dist/networking/tools
  • Visual Threads thread tracer was recently ported
    from Tru64 to HP-UX (IPF only). Currently in beta
    status. See
  • http://shvlhd.zko.cpqcorp.net/ThreadTools
  • GlancePlus, the official HP-UX system monitor
    with GUI, offers detailed statistics and graphs on
    all relevant system resources like CPUs, Memory,
    Swap, Disks, Network, Processes. Invocation:
    /opt/perf/bin/gpm

45
Using HP MPI
46
Using HP MPI (1): Overview
  • Is part of HP-UX TCOE
  • Installs in /opt/mpi
  • Implements full MPI 1.2 and 90% of the MPI 2
    standard definition. Missing only dynamic process
    deletion.
  • Optimized for both SMP and cluster use
  • Supports hybrid OMP + MPI programs through a
    thread safe library
  • Supports up to 8 byte integers and 16 byte reals
    through the +i8 and +autodbl switches
  • Integrated with LSF / PAM
  • Supported with TotalView (currently PA-RISC only)

47
Using HP MPI (2): Build
  • Add /opt/mpi/bin to PATH
  • Make sure that the HyperFabric2 fileset is
    installed in /opt/clic even if no HF2 cards are
    there. This is not part of HP MPI.
  • Use convenience scripts mpif90 and mpicc for
    compiling and linking. No need to worry about
    include and library paths.
  • Linking for HMP is tricky. Use archive libraries.
    Suggestion: -Wl,-a,archive_shared \
    -lmpi_hmp -lclic_csi

48
Using HP MPI (3): Run
  • Invocation on SMP systems
  • export MPI_FLAGS=...
  • /opt/mpi/bin/mpirun -np 64 ...
  • Cluster execution uses the remsh mechanism, so
    $HOME/.rhosts is required. Create an appfile
    with 1 entry for each host. All necessary env
    variables must be distributed through the appfile.
  • -h rx01 -np 2 -e MPI_HMP=ON -e MPI_FLAGS=...
  • -h rx02 -np 2 -e MPI_HMP=ON -e MPI_FLAGS=...
  • -h rx03 -np 2 -e MPI_HMP=ON -e MPI_FLAGS=...
  • -h rx..
  • Invocation from master node
  • /opt/mpi/bin/mpirun -f appfile
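
Writing one appfile entry per host by hand gets tedious on larger clusters. The sketch below generates the entries; it is an illustration only. The function name appfile_lines and the program name ./solver are hypothetical, and the slides do not show the program field, so its position at the end of each entry is an assumption.

```python
def appfile_lines(hosts, ranks_per_host, program, env=None):
    """Build one appfile entry per host: -h host -np N [-e VAR=value ...] program."""
    env = env or {}
    env_part = " ".join(f"-e {name}={value}" for name, value in env.items())
    lines = []
    for host in hosts:
        parts = ["-h", host, "-np", str(ranks_per_host)]
        if env_part:
            parts.append(env_part)
        parts.append(program)  # ASSUMPTION: program comes last in each entry
        lines.append(" ".join(parts))
    return lines


# "./solver" is a placeholder program name, not taken from the slides.
lines = appfile_lines(["rx01", "rx02", "rx03"], 2, "./solver", {"MPI_HMP": "ON"})
print("\n".join(lines))
```

Generating the file also makes it easy to keep the environment settings identical on every host, which the slides point out must be distributed through the appfile.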

49
Using HP MPI (4): Environment variables
  • Yield a waiting CPU after 1s of spinwaiting.
    Useful on SMP systems to avoid context switches
    and process migration if not oversubscribed.
  • export MPI_FLAGS=y1000
  • Set gang scheduling (MPI + OMP), useful on
    oversubscribed SMPs
  • export MP_GANG=ON
  • Run with the chosen debugger (see man mpidebug
    for details)
  • export MPI_FLAGS=ewdb
  • Choose HMP instead of TCP/IP if HF2 is available.
    HMP will identify the HF2 interface even if it
    does not correspond to the hostname.
  • export MPI_HMP=ON

50
Using HP MPI (5): Lightweight instrumentation
  • Lightweight instrumentation dumps messaging
    statistics into a .instr text file at
    termination.
  • -e MPI_INSTR=...
  • mpirun -i ...
  • Statistics include
  • User time, MPI overhead, blocking time, system
    time for each rank
  • Number of calls for each MPI call for each rank
  • Message sizes for each pair of ranks
  • Despite the missing GUI these statistics are very
    useful to determine load imbalance and total MPI
    overhead. Intrusion is far below 10%. No
    relinking required.
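
Once the per-rank times are available, load imbalance and total MPI overhead reduce to two ratios. The sketch below shows one common way to compute them; the per-rank numbers are invented for illustration, and parsing of the .instr file itself is not shown because its layout is not described in the slides.

```python
def load_imbalance(times):
    """How far the slowest rank is above the average, as a fraction (0.0 = balanced)."""
    mean = sum(times) / len(times)
    return max(times) / mean - 1.0


def mpi_overhead_fraction(user_times, mpi_times):
    """Share of the summed run time that was spent inside MPI calls."""
    total = sum(user_times) + sum(mpi_times)
    return sum(mpi_times) / total


# HYPOTHETICAL per-rank seconds: rank 3 computes longer while ranks 0-2 wait in MPI.
user = [10.0, 10.0, 10.0, 14.0]
mpi = [4.0, 4.0, 4.0, 0.0]

imbalance = load_imbalance(user)
overhead = mpi_overhead_fraction(user, mpi)
print(f"load imbalance {imbalance:.1%}, MPI overhead {overhead:.1%}")
```

In this made-up case most of the MPI time is waiting caused by the imbalance, so redistributing work would help more than tuning the interconnect.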

51
Using HP MPI (6): Profiling with CALIPER on a
cluster
  • Step 1: Create a little command script
  • #!/bin/sh
  • /opt/caliper/bin/caliper sample_ip \
    --html .$(hostname).
  • Step 2: Invoke the command script instead of the
    program from the appfile
  • -h rx01 -np 2 -e MPI_HMP=ON -e MPI_FLAGS=...
  • -h rx02 -np 2 -e MPI_HMP=ON -e MPI_FLAGS=...
  • -h rx03 -np 2 -e MPI_HMP=ON -e MPI_FLAGS=...
  • -h rx..
  • Step 3: Activate lightweight instrumentation and
    run it
  • /opt/mpi/bin/mpirun -i ... -f appfile
  • MPI will dump its statistics and CALIPER will
    create an HTML report directory for each rank.
  • This method works on clusters as well as on SMPs

52
Using HP MPI (7): PALLAS MPI-1 key results on
RX2600
(*) Due to protocol complexity HMP has higher
latency than MYRICOM GM. Kernel bypass might improve HMP latency to
15 µsec but is not committed, as InfiniBand is
already on the horizon to replace HyperFabric2.
53
(No Transcript)