Performance-Related Configuration Variables on CPU

The following variables enable you to configure aspects of the Intel® CPU Runtime for OpenCL™ Applications.

The default configuration offers the most stable performance.
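
As a minimal sketch of how such a variable might be applied to a single run, the snippet below assumes the variables are read from the process environment (the mention of shell instances and of unsetting variables suggests this, but it is an assumption). The helper name `make_env` is hypothetical, and a child Python process stands in for a real OpenCL application. The exact spelling the runtime accepts for Boolean values is also assumed to match the `True`/`False` used in the table below.

```python
import os
import subprocess
import sys


def make_env(overrides):
    """Return a copy of the current environment with the given variables set.

    Hypothetical helper: it leaves os.environ itself untouched, so the
    override affects only processes launched with the returned mapping.
    """
    env = dict(os.environ)
    for name, value in overrides.items():
        env[name] = str(value)
    return env


# Assumption: the runtime reads this variable from the environment.
env = make_env({"CL_CONFIG_USE_FAST_RELAXED_MATH": "True"})

# Stand-in for an OpenCL application: a child process that echoes the variable.
child = subprocess.run(
    [sys.executable, "-c",
     "import os; print(os.environ['CL_CONFIG_USE_FAST_RELAXED_MATH'])"],
    env=env,
    capture_output=True,
    text=True,
    check=True,
)
seen = child.stdout.strip()
```

Passing a modified copy of the environment to one child process keeps the override local to that run, so other applications continue to use the default configuration.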

Variable Value Type Default Value Description
CL_CONFIG_USE_FAST_RELAXED_MATH Boolean False If set to True, kernels are built with the -cl-fast-relaxed-math build option, which enables floating-point optimizations that may violate the IEEE 754 standard and the OpenCL™ numerical compliance requirements.
CL_CONFIG_CPU_RT_LOOP_UNROLL_FACTOR Integer 2 Defines the unrolling factor for loops with a non-constant trip count.
Allowed values: [1, 16]. Out-of-bounds values are clamped.
1: loop unrolling is disabled.
Example values: 1, 2, 3, 16.
CL_CONFIG_USE_VECTORIZER Boolean   Affects the behavior of the vectorizer. The setting applies to the entire system (or to the shell instances in which it is set) until the variable is explicitly unset (or the shells terminate).
CL_CONFIG_CPU_VECTORIZER_MODE Integer 0 Sets the vectorization “width” (when CL_CONFIG_USE_VECTORIZER = True).
Allowed values: 0, 1, 4, 8, 16.
0: the compiler decides heuristically whether to vectorize each kernel and, if so, which vector width to use.
1: the compiler performs no vectorization. Explicit vector data types in kernels are left intact (the same as CL_CONFIG_USE_VECTORIZER = False).
CL_CONFIG_CPU_TARGET_ARCH String Autodetect Generates code exclusively for the given target CPU architecture. The value can only lower the instruction set level relative to what the CPU supports.
Allowed values:
  • skx - Generates code for processors that support Intel® Advanced Vector Extensions 512 (Intel® AVX-512) Foundation instructions, Intel® AVX-512 Conflict Detection instructions, Intel® AVX-512 Doubleword and Quadword instructions, Intel® AVX-512 Byte and Word instructions and Intel® AVX-512 Vector Length Extensions for Intel® processors, and the instructions enabled with core-avx2.
  • core-avx2 - Generates code for processors that support Intel® Advanced Vector Extensions 2 (Intel® AVX2), Intel® AVX, SSE4.2, SSE4.1, SSE3, SSE2, SSE, and SSSE3 instructions.
  • corei7-avx - Generates code for processors that support Intel® Advanced Vector Extensions (Intel® AVX), Intel® SSE4.2, SSE4.1, SSE3, SSE2, SSE, and SSSE3 instructions.
  • corei7 - Generates code for processors that support Intel® SSE4.2 Efficient Accelerated String and Text Processing instructions. May also generate code for Intel® SSE4 Vectorizing Compiler and Media Accelerator, Intel® SSE3, SSE2, SSE, and SSSE3 instructions.
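
The allowed values listed in the table can be checked before a run. The sketch below is illustrative only: the helper names `effective_unroll_factor` and `check_settings` are hypothetical and not part of the runtime. It mirrors the documented clamping of the unroll factor to [1, 16] and flags vectorizer modes or target architectures that are not in the lists above.

```python
# Values taken from the table above.
ALLOWED_VECTORIZER_MODES = {0, 1, 4, 8, 16}
ALLOWED_TARGET_ARCHS = {"skx", "core-avx2", "corei7-avx", "corei7"}


def effective_unroll_factor(requested):
    """Mirror the documented clamping: values outside [1, 16] are clamped."""
    return max(1, min(16, requested))


def check_settings(settings):
    """Return a list of problems found in a dict of variable name -> value.

    Hypothetical helper: only the two variables with enumerated allowed
    values are checked; unknown names are ignored.
    """
    problems = []
    mode = settings.get("CL_CONFIG_CPU_VECTORIZER_MODE")
    if mode is not None and mode not in ALLOWED_VECTORIZER_MODES:
        problems.append(f"CL_CONFIG_CPU_VECTORIZER_MODE: {mode!r} is not an allowed value")
    arch = settings.get("CL_CONFIG_CPU_TARGET_ARCH")
    if arch is not None and arch not in ALLOWED_TARGET_ARCHS:
        problems.append(f"CL_CONFIG_CPU_TARGET_ARCH: {arch!r} is not an allowed value")
    return problems


# Usage: a requested factor of 20 is treated as 16; a mode of 2 is reported.
clamped = effective_unroll_factor(20)
issues = check_settings({"CL_CONFIG_CPU_VECTORIZER_MODE": 2})
```

A check like this does not verify that the chosen architecture is at or below what the host CPU supports; that constraint would need a separate CPU feature query.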