The Immediate Command Execution extension lets you execute OpenCL™ commands in a single-threaded manner: the calling thread performs the actual execution.
To use this extension, add the CL_QUEUE_THREAD_LOCAL_EXEC_ENABLE_INTEL token to the queue properties when calling clCreateCommandQueue.
clEnqueueXXX calls to that queue are synchronous: they return only after the queued command finishes executing. Only the thread that calls clEnqueueXXX executes those commands, including calls to clEnqueueNDRangeKernel.
The extension tokens are defined in the cl_ext.h file, which is provided with the Intel® SDK for OpenCL™ Applications (https://software.intel.com/en-us/intel-opencl).
Using this extension, you can create an Immediate Command Execution queue alongside your other queues and use it to execute lightweight kernels or fine-grained NDRanges (small global size) that cannot gain much from the Intel® multi-core architecture. You still get the full benefits of the compiler, including the automatic vectorization module.
An Immediate Command Execution queue can be in-order or out-of-order.
In the in-order mode, if multiple threads enqueue commands to the same queue at the same time, they block each other to comply with the OpenCL in-order queue semantics. Therefore, when several threads share one queue, you should combine CL_QUEUE_THREAD_LOCAL_EXEC_ENABLE_INTEL with CL_QUEUE_OUT_OF_ORDER_EXEC_MODE_ENABLE.