Dedicate one thread on the host CPU to scheduling commands for each Intel® Xeon Phi™ coprocessor. The runtime does this for you by default. However, to use the CPU OpenCL™ device efficiently as well (in the same context as the coprocessors or in a separate context), use the device fission feature to keep the CPU device from competing with the scheduling threads for host cores. For more information on the device fission extension for the CPU OpenCL device, refer to the Developer Guide for Intel® SDK for OpenCL™ Applications.
Also consider experimenting with different configurations, as various trade-offs are possible.
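The following is a minimal sketch of one way to apply device fission to the CPU device, not the method prescribed by the SDK. It assumes that leaving one host core free per coprocessor is a reasonable starting point, which is an interpretation of the advice above and should be tuned by experiment. The helper names `cpu_units_for_compute` and `create_cpu_subdevice` and the `WITH_OPENCL` build flag are hypothetical. The OpenCL part assumes the OpenCL 1.2 core API (`clCreateSubDevices`) rather than the older extension entry point.

```c
#include <stddef.h>

/* Hypothetical helper (not part of any SDK): number of compute units left for
 * the CPU OpenCL sub-device after reserving one host core per coprocessor for
 * the command-scheduling threads. Returns 0 if no core would be left. */
unsigned cpu_units_for_compute(unsigned total_units, unsigned num_coprocessors)
{
    return (num_coprocessors < total_units) ? total_units - num_coprocessors : 0;
}

#ifdef WITH_OPENCL /* hypothetical build flag: needs OpenCL 1.2 headers and library */
#include <CL/cl.h>

/* Create a single CPU sub-device with `units` compute units.
 * Returns NULL if `units` is 0 or if partitioning fails. */
cl_device_id create_cpu_subdevice(cl_device_id cpu, unsigned units)
{
    if (units == 0)
        return NULL;

    /* Partition by counts: one sub-device holding `units` compute units. */
    cl_device_partition_property props[] = {
        CL_DEVICE_PARTITION_BY_COUNTS,
        (cl_device_partition_property)units,
        CL_DEVICE_PARTITION_BY_COUNTS_LIST_END,
        0
    };

    cl_device_id sub = NULL;
    cl_uint returned = 0;
    cl_int err = clCreateSubDevices(cpu, props, 1, &sub, &returned);
    return (err == CL_SUCCESS && returned == 1) ? sub : NULL;
}
#endif
```

In this sketch, a host that reports 16 compute units (as returned by `clGetDeviceInfo` with `CL_DEVICE_MAX_COMPUTE_UNITS`) and drives 2 coprocessors would get a CPU sub-device with 14 compute units. Use the returned sub-device in place of the full CPU device when creating the context and command queues.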