Getting Credible Performance Numbers

Performance measurements are typically taken over a large number of invocations of the same routine. Because the first iteration is almost always significantly slower than subsequent ones, the minimum (or the average, geometric mean, and so on) of the execution times is usually used for final projections.

An alternative to calling the kernel several times is to use a single “warm-up” run.
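
The two approaches can be sketched in a few lines. The following is a minimal, hedged illustration rather than OpenCL host code: `run_kernel_stand_in` is a hypothetical placeholder for one kernel invocation (in a real host program, the kernel enqueue followed by a blocking wait such as `clFinish`), and `measure` and `time_once` are names invented for this sketch.

```python
import time


def run_kernel_stand_in(n=20_000):
    """Hypothetical stand-in for one kernel invocation (enqueue plus wait)."""
    total = 0
    for i in range(n):
        total += i * i
    return total


def time_once(fn):
    """Return the wall-clock duration of a single call to fn, in seconds."""
    start = time.perf_counter()
    fn()
    return time.perf_counter() - start


def measure(fn, iterations=20, warm_up=True):
    """Time fn `iterations` times, optionally after one untimed warm-up call."""
    if warm_up:
        fn()  # warm-up run: its duration is deliberately not recorded
    return [time_once(fn) for _ in range(iterations)]


if __name__ == "__main__":
    times = measure(run_kernel_stand_in)
    print(f"min  {min(times) * 1e3:.3f} ms")
    print(f"mean {sum(times) / len(times) * 1e3:.3f} ms")
```

Calling `measure(fn, warm_up=False)` and comparing the first recorded time with the rest is a quick way to see whether a first-iteration penalty exists for a given workload.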

The warm-up run can be helpful for kernels with a small amount of computation, because it absorbs the following potential one-time costs so that they do not distort the timed runs:

NOTE: Base your performance conclusions on reproducible data. If the warm-up run does not help, or the execution time still varies, try running a large number of iterations and averaging the results. For time values that vary widely, consider using the geometric mean.
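
To illustrate why the geometric mean can be preferable when timings vary widely, here is a small sketch. The sample values are made up for illustration, and the helper names `arithmetic_mean` and `geometric_mean` are assumptions of this sketch, not part of any SDK.

```python
import math


def arithmetic_mean(values):
    """Plain average: sum of the values divided by their count."""
    return sum(values) / len(values)


def geometric_mean(values):
    """n-th root of the product, computed as exp of the mean of the logs."""
    if any(v <= 0 for v in values):
        raise ValueError("geometric mean requires strictly positive values")
    return math.exp(sum(math.log(v) for v in values) / len(values))


if __name__ == "__main__":
    # Made-up execution times in milliseconds, with one slow outlier.
    times_ms = [1.0, 1.0, 1.0, 8.0]
    print(f"arithmetic mean: {arithmetic_mean(times_ms):.2f} ms")
    print(f"geometric mean:  {geometric_mean(times_ms):.2f} ms")
```

With these made-up values the arithmetic mean is 2.75 ms, while the geometric mean is about 1.68 ms, so the single slow run pulls the geometric mean much less far from the typical value.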

Consider the following:

Refer to the “OpenCL™ Optimizations Tutorial” SDK sample for code examples of performing a warm-up run before starting performance measurements.

See Also

Simple Optimizations of OpenCL™ Code