This variable directs the compiler to generate code exclusively for a given target CPU architecture.
CL_CONFIG_CPU_TARGET_ARCH can only lower the instruction set level below what the CPU supports; it cannot enable instructions that the CPU lacks.
By default, it is set to Autodetect.
Allowed values are:
CL_CONFIG_CPU_TARGET_ARCH = skx. Generates code for processors that support Intel® Advanced Vector Extensions 512 (Intel® AVX-512) Foundation instructions, Intel® AVX-512 Conflict Detection instructions, Intel® AVX-512 Doubleword and Quadword instructions, Intel® AVX-512 Byte and Word instructions, and Intel® AVX-512 Vector Length Extensions for Intel® processors, and the instructions enabled with core-avx2.
CL_CONFIG_CPU_TARGET_ARCH = core-avx2. Generates code for processors that support Intel® Advanced Vector Extensions 2 (Intel® AVX2), Intel® AVX, SSE4.2, SSE4.1, SSE3, SSE2, SSE, and SSSE3 instructions.
CL_CONFIG_CPU_TARGET_ARCH = corei7-avx. Generates code for processors that support Intel® Advanced Vector Extensions (Intel® AVX), Intel® SSE4.2, SSE4.1, SSE3, SSE2, SSE, and SSSE3 instructions.
CL_CONFIG_CPU_TARGET_ARCH = corei7. Generates code for processors that support Intel® SSE4.2 Efficient Accelerated String and Text Processing instructions. May also generate code for Intel® SSE4 Vectorizing Compiler and Media Accelerator, Intel® SSE3, SSE2, SSE, and SSSE3 instructions.

Some kernels cannot be vectorized, so the vectorizer leaves them untouched regardless of the mode. Also be careful when manually overriding the compiler heuristic: the build fails if the target hardware does not support the specified vectorization width. Inspect the compiler output in the offline compiler tool (described in the Developer Guide) for messages related to vectorization.
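
As an illustration of the values listed above, the following Python sketch builds an environment for a child process with CL_CONFIG_CPU_TARGET_ARCH set. It assumes the runtime reads the variable from the process environment when the application starts, which this section does not state explicitly. The helper names (is_lowering, env_with_target_arch) and the caller-supplied detected level are hypothetical and not part of any Intel tool. The sketch rejects a request above the detected level as a precaution, since the variable can only lower the instruction set level.

```python
import os

# Allowed values from this section, ordered from the lowest to the
# highest instruction set level.
TARGET_ARCH_LEVELS = ("corei7", "corei7-avx", "core-avx2", "skx")


def is_lowering(requested, detected):
    """Return True if `requested` does not exceed the `detected` level."""
    return TARGET_ARCH_LEVELS.index(requested) <= TARGET_ARCH_LEVELS.index(detected)


def env_with_target_arch(requested, detected, base_env=None):
    """Return a copy of the environment with CL_CONFIG_CPU_TARGET_ARCH set.

    `detected` is the level the CPU supports; it is supplied by the caller
    (assumption: this sketch does not detect it).
    """
    for name in (requested, detected):
        if name not in TARGET_ARCH_LEVELS:
            raise ValueError(f"unknown target architecture: {name!r}")
    if not is_lowering(requested, detected):
        raise ValueError(f"{requested!r} is above the supported level {detected!r}")
    env = dict(os.environ if base_env is None else base_env)
    env["CL_CONFIG_CPU_TARGET_ARCH"] = requested
    return env
```

The returned dictionary can be passed as the env argument of subprocess.run, for example subprocess.run(["./my_opencl_app"], env=env_with_target_arch("core-avx2", "skx")), where ./my_opencl_app stands for a hypothetical OpenCL application.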