Analyzing OpenCL™ Kernel Performance
To analyze OpenCL™ kernel performance with the Intel® SDK for OpenCL™
Applications standalone version, do the following:
- Click the Analyze button.
- Click Refresh kernel(s) to get the list of kernels in the
currently open *.cl file.
- Select the target kernel from the pull-down menu. If only one kernel
is available, it is selected by default.
- Click cells in the Assigned Variables column to create or
add variables as kernel arguments. You can assign one-dimensional
variables (such as integer, float, char, half, and so on) on the fly
by typing single values into the table. See the Creating Variables
section for details.
- Set the number of iterations, the global size, and the local sizes
per workload dimension in the Workgroup size definitions group box.
- Click Analyze to wrap the selected kernel and run the analysis.
You can use the local size(s) text boxes to set up several kinds of
test configurations:
- Set a single size value for a single test.
- Enter several comma-separated sizes for multiple tests.
- Set 0 to use the default local size assigned by the framework.
- Click Auto to have the tool iterate over all sizes that are
smaller than the global size and the device maximum local size.
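The local size options above can be sketched in a few lines of Python. This is only an illustration of how one text box might be interpreted, not the tool's actual code: the helper name candidate_local_sizes and its parameters are hypothetical. It also assumes that Auto tries only sizes that divide the global size evenly (which OpenCL 1.x requires of an explicit local size) and that do not exceed the device maximum.

```python
def candidate_local_sizes(field, global_size, device_max):
    """Interpret one local-size text box for one dimension.

    Hypothetical helper for illustration; the SDK's real logic may differ.
    Returns a list of local sizes to test. None means "let the OpenCL
    framework pick the local size" (the effect of entering 0).
    """
    text = field.strip().lower()
    if text == "auto":
        # Assumption: Auto walks every size up to the smaller of the
        # global size and the device limit, keeping only even divisors.
        limit = min(global_size, device_max)
        return [s for s in range(1, limit + 1) if global_size % s == 0]

    # A single value or a comma-separated list of values.
    sizes = [int(part) for part in text.split(",") if part.strip()]
    if not sizes:
        raise ValueError("local size field is empty")
    return [None if s == 0 else s for s in sizes]


if __name__ == "__main__":
    print(candidate_local_sizes("64", 1024, 256))          # one test
    print(candidate_local_sizes("16, 32, 64", 1024, 256))  # three tests
    print(candidate_local_sizes("0", 1024, 256))           # framework default
    print(candidate_local_sizes("Auto", 12, 8))            # every valid size
```

Because the options apply per dimension, a multi-dimensional run would call such a helper once per dimension and test the combinations of the resulting lists.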
Also consider the following:
- Each option is available for each dimension.
- To analyze the kernel in its designed conditions, set a single
value.
- To find the local size that gives the best performance,
click Auto or set a list of comma-separated values.
- To improve the analysis accuracy, run each global and local work
size combination several times by increasing the Number of iterations
value. Running several iterations reduces the impact of other system
processes or tasks on the measured kernel execution time.
- Use the Device Information dialog to compare device properties and
choose the appropriate device for the kernel.
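The advice on repeating each work size combination can be illustrated with a small timing loop. This is a sketch under stated assumptions, not a description of how the tool measures time: time_configurations and the run callable are hypothetical stand-ins for launching the kernel and waiting for it to finish, and reporting the median is a choice made here because it is less sensitive to an occasional slow run caused by other system activity.

```python
import statistics
import time


def time_configurations(run, configs, iterations):
    """Time each configuration several times and summarize the samples.

    run        -- hypothetical callable that executes the kernel once for
                  a given configuration and returns when it has finished
    configs    -- iterable of hashable configurations, for example
                  (global_size, local_size) tuples
    iterations -- number of runs per configuration (at least 1)
    Returns {config: (median_seconds, min_seconds, max_seconds)}.
    """
    if iterations < 1:
        raise ValueError("iterations must be at least 1")
    results = {}
    for config in configs:
        samples = []
        for _ in range(iterations):
            start = time.perf_counter()
            run(config)
            samples.append(time.perf_counter() - start)
        results[config] = (statistics.median(samples), min(samples), max(samples))
    return results


if __name__ == "__main__":
    # Stand-in workload; a real test would enqueue the kernel instead.
    def fake_kernel(config):
        global_size, _local_size = config
        sum(range(global_size))

    timings = time_configurations(fake_kernel, [(4096, 16), (4096, 64)], 5)
    for config, (median, fastest, slowest) in timings.items():
        print(config, f"median={median:.6f}s min={fastest:.6f}s max={slowest:.6f}s")
```

A wide gap between the minimum and maximum for one configuration suggests that other processes interfered, which is a cue to raise the iteration count.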
See Also
Creating Variables