Intel® Math Kernel Library 2018 Update 1 Developer Guide
The Intel Distribution for LINPACK Benchmark reacts to the MKL_MIC_ENABLE and OFFLOAD_DEVICES environment variables for automatic offload to Intel Xeon Phi coprocessors. It also reports how many Intel Xeon Phi coprocessors are detected during the run. The top of the output has a line like this:
Number of Intel® Xeon Phi™ coprocessors: 1
If Intel Xeon Phi coprocessors are available on your cluster and you expect offloading to occur, but the number printed is zero, it is likely that the correct compiler environment was not loaded. Specifically, check whether the LD_LIBRARY_PATH environment variable includes the directories that contain the shared libraries libcoi_host.so.0 and libscif.so.0, which are installed by the Intel® Manycore Platform Software Stack (Intel® MPSS).
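The check described above can be scripted. The following is a minimal sketch, not part of the benchmark package: the function names are hypothetical, and it inspects only the directories listed in LD_LIBRARY_PATH, not other locations the dynamic loader may search.

```python
import os
import re

# Library names taken from the text above.
REQUIRED_LIBS = ("libcoi_host.so.0", "libscif.so.0")

# Matches the banner line. The trademark marks are treated as optional
# in case the terminal or log capture drops them.
COUNT_RE = re.compile(
    r"Number of Intel\S*\s+Xeon Phi\S*\s+coprocessors:\s*(\d+)"
)


def parse_coprocessor_count(output):
    """Return the coprocessor count printed in the banner, or None if absent."""
    match = COUNT_RE.search(output)
    return int(match.group(1)) if match else None


def missing_mpss_libs(ld_library_path=None):
    """Return the required libraries not found in any LD_LIBRARY_PATH directory."""
    if ld_library_path is None:
        ld_library_path = os.environ.get("LD_LIBRARY_PATH", "")
    dirs = [d for d in ld_library_path.split(os.pathsep) if d]
    return [
        lib
        for lib in REQUIRED_LIBS
        if not any(os.path.isfile(os.path.join(d, lib)) for d in dirs)
    ]


if __name__ == "__main__":
    banner = "Number of Intel® Xeon Phi™ coprocessors: 0"
    if parse_coprocessor_count(banner) == 0:
        print("Missing from LD_LIBRARY_PATH:", missing_mpss_libs())
```

If the reported count is zero and the returned list is not empty, load the compiler and Intel MPSS environment before you rerun the benchmark.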
You can use environment variables to adjust the behavior of your runs. For a list of supported environment variables, see Environment Variables. The variable NUMMIC refers to the number of Intel Xeon Phi coprocessors per cluster node. You can use the HPL_MIC_DEVICE and HPL_MIC_SHAREMODE environment variables to share the Intel Xeon Phi coprocessors among MPI processes. The scripts runme_intel64_dynamic and runme_intel64_static set these environment variables for you for a given number of MPI ranks per node.
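To see how several MPI ranks on one node might share coprocessors, consider the following sketch of a round-robin assignment. It is an illustration only and does not reproduce what the runme scripts do. The function names are hypothetical, and the sketch assumes that HPL_MIC_DEVICE accepts a device number starting at 0. It leaves HPL_MIC_SHAREMODE unset; see Environment Variables for the accepted values of both variables.

```python
def device_for_rank(local_rank, nummic):
    """Round-robin: map a node-local MPI rank to a 0-based coprocessor index."""
    if nummic <= 0:
        raise ValueError("nummic must be positive")
    return local_rank % nummic


def ranks_per_device(ranks_per_node, nummic):
    """Return {device index: [node-local ranks]} for one node."""
    sharing = {dev: [] for dev in range(nummic)}
    for rank in range(ranks_per_node):
        sharing[device_for_rank(rank, nummic)].append(rank)
    return sharing


def env_for_rank(local_rank, nummic):
    """Environment additions for one rank.

    Assumption: HPL_MIC_DEVICE takes a 0-based device number.
    HPL_MIC_SHAREMODE is intentionally not set in this sketch.
    """
    return {"HPL_MIC_DEVICE": str(device_for_rank(local_rank, nummic))}


if __name__ == "__main__":
    # Example: 4 MPI ranks per node, NUMMIC=2.
    for device, ranks in ranks_per_device(4, 2).items():
        print("coprocessor", device, "is shared by local ranks", ranks)
```

Any device that ends up with more than one rank in this mapping is shared, which is the case the sharing variables are meant to handle.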
Adjust the block size NB and problem size N parameters as appropriate. The table below shows recommended values of NB for different numbers of Intel Xeon Phi coprocessors per node. The values may vary and depend on the PCI Express settings and performance of main memory.
1 coprocessor | 2 coprocessors |
---|---|
896 | 1280 |
Large values of NB require extra memory on the host processor and coprocessor. If this memory is limited, the problem size N may be too small to satisfy the inequality N >> NB. In that event, it is better not to use the Intel Distribution for LINPACK Benchmark in offload mode.
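As a rough way to check the inequality before a run, you can estimate the largest N that fits in memory and compare it with NB. The sketch below relies on the fact that an N-by-N double-precision matrix occupies 8·N² bytes. The NB values come from the table above. Everything else is an assumption chosen for illustration: the function names, the 80% memory fraction, and the ratio of 32 used as a stand-in for "much greater than".

```python
import math

# Recommended NB per number of coprocessors per node, from the table above.
RECOMMENDED_NB = {1: 896, 2: 1280}


def largest_problem_size(mem_bytes, nb, fraction=0.8):
    """Largest multiple of nb such that 8*N*N <= fraction * mem_bytes.

    mem_bytes is the total memory available for the matrix across all
    nodes. The default fraction is an assumed rule of thumb.
    """
    n_max = math.isqrt(int(fraction * mem_bytes) // 8)
    return (n_max // nb) * nb


def offload_looks_reasonable(n, nb, min_ratio=32):
    """True if N is at least min_ratio times NB (min_ratio is hypothetical)."""
    return n >= min_ratio * nb


if __name__ == "__main__":
    nb = RECOMMENDED_NB[2]
    total_mem = 64 * 2**30  # example: 64 GiB in total
    n = largest_problem_size(total_mem, nb)
    print("N =", n, "NB =", nb, "N/NB =", n // nb)
    print("offload worth trying:", offload_looks_reasonable(n, nb))
```

Rounding N down to a multiple of NB is a convenience in this sketch, not a requirement stated in this guide.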
Optimization Notice |
---|
Intel's compilers may or may not optimize to the same degree for non-Intel microprocessors for optimizations that are not unique to Intel microprocessors. These optimizations include SSE2, SSE3, and SSSE3 instruction sets and other optimizations. Intel does not guarantee the availability, functionality, or effectiveness of any optimization on microprocessors not manufactured by Intel. Microprocessor-dependent optimizations in this product are intended for use with Intel microprocessors. Certain optimizations not specific to Intel microarchitecture are reserved for Intel microprocessors. Please refer to the applicable product User and Reference Guides for more information regarding the specific instruction sets covered by this notice. Notice revision #20110804 |