Application-Level Optimizations
Avoid Needless Synchronization
Reuse Compilation Results with clCreateProgramWithBinary