For analysis of the kernel side of the application, the following reports are generated:
The Occupancy report shows, for each kernel, the occupancy of each execution unit in the GPU.
You can also see the number of GPU threads launched and the minimum, maximum, and average thread execution times.
The Ticks per Thread report shows, for each number of active
threads, how long that number of threads was active.
The Threads per Time report shows the number of threads that were active at each point in time during execution.
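The two thread reports can be read as two views of the same data: one is a timeline, the other a histogram of that timeline. The following Python sketch illustrates the relationship only. The sample values and the names threads_per_time and ticks_per_thread are invented for this illustration and are not part of the tool or its output format.

```python
from collections import Counter

# Hypothetical samples, invented for illustration:
# (tick, number of threads active at that tick).
# This is the kind of data a "Threads per Time" view plots.
threads_per_time = [(0, 2), (1, 4), (2, 4), (3, 3), (4, 4), (5, 1)]


def ticks_per_thread(samples):
    """Count how many sampled ticks each active-thread count was observed for."""
    counts = Counter(active for _, active in samples)
    return dict(sorted(counts.items()))


# This is the kind of data a "Ticks per Thread" view plots:
# active-thread count -> number of ticks spent at that count.
histogram = ticks_per_thread(threads_per_time)
print(histogram)
```

In this made-up sample, four threads were active for three of the six ticks, so a histogram weighted toward the high thread counts suggests the GPU was kept busy for most of the run.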
The Latency pane shows, for each kernel file, the overall latency
of the memory commands.
Click the kernel name to see the latency of each memory command in the
source code of this kernel.