Intel® VTune™ Amplifier
Use the graphics-rendering analysis (preview) to estimate your code performance based on GPU usage per engine and GPU hardware metrics.
It focuses on the following usage models:
System-wide profiling on all virtual domains (Dom0, DomUs) running under the Xen* hypervisor to identify domains that consume too many resources and become a bottleneck for the whole platform. Use the -target-system option to specify a remote machine connected to your host via SSH.
Profiling of OpenGL-ES applications running on Linux* systems to detect performance-critical API calls. For this mode, specify the application to analyze or a process to attach to, using the -target-process or -target-pid options.
This analysis type is available on processors based on Intel® microarchitecture code name Broadwell and later.
GPU In-kernel Profiling instruments your code and, depending on your configuration settings, helps identify performance-critical basic blocks or issues caused by memory accesses in the GPU kernels.
Since GPU In-kernel Profiling incurs higher performance overhead than the GPU Hotspots analysis, consider running the GPU Hotspots analysis first to identify the hottest GPU computing task (GPU kernel), and then explore this kernel with GPU In-kernel Profiling.
GPU In-kernel Profiling introduces the following key metrics:
Estimated GPU Cycles: The average number of GPU cycles per kernel instance.
GPU Instructions Executed per Instance: The average number of GPU instructions executed per kernel instance.
GPU Instructions Executed per Thread: The average number of GPU instructions executed by one thread per kernel instance.
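As a rough illustration of how these averages relate, the sketch below divides hypothetical totals by a hypothetical instance count and thread count. All numbers and variable names are made up for illustration, and the per-thread division assumes that every instance runs the same number of threads. VTune Amplifier derives the actual metrics from the collected data.

```shell
# Hypothetical totals for one GPU kernel (illustrative values only).
total_cycles=1200000        # GPU cycles summed over all instances
total_instructions=256000   # GPU instructions summed over all instances
instances=4                 # number of kernel instances
threads_per_instance=64     # assumed equal for every instance

# Averages per kernel instance (integer arithmetic).
cycles_per_instance=$((total_cycles / instances))
instr_per_instance=$((total_instructions / instances))

# Average per thread within one kernel instance.
instr_per_thread=$((instr_per_instance / threads_per_instance))

echo "Estimated GPU Cycles: $cycles_per_instance"
echo "GPU Instructions Executed per Instance: $instr_per_instance"
echo "GPU Instructions Executed per Thread: $instr_per_thread"
```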
Syntax:
$ amplxe-cl [--target-system=ssh:username@hostname[:port]] --collect graphics-rendering [--knob <knobName=knobValue>] -- [target] [target_options]
Knobs: gpu-sampling-interval, gpu-counters-mode=render-basic.
For the most current information on available knobs (configuration options) for the Graphics Rendering analysis, enter:
$ amplxe-cl -help collect graphics-rendering
Example:
This example runs a system-wide Graphics Rendering analysis for a remote Xen target:
host>./amplxe-cl --target-system=ssh:user1@172.16.254.1 --collect graphics-rendering --duration 0
This example runs the Graphics Rendering analysis to profile an OpenGL-ES application:
host>./amplxe-cl --collect graphics-rendering --target-process process1
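The knobs listed under Syntax can be combined on one command line. The following dry-run sketch only prints such a command rather than running it, so it works without VTune Amplifier installed. The helper name build_collect_cmd, the interval value 1, and the target ./my_gles_app are assumptions for illustration; check the -help output for the values your version accepts.

```shell
# Dry-run sketch: print the collection command instead of executing it.
# build_collect_cmd is a hypothetical helper, not part of amplxe-cl.
build_collect_cmd() {
    interval="$1"   # assumed value for the gpu-sampling-interval knob
    target="$2"     # hypothetical application to profile
    printf 'amplxe-cl --collect graphics-rendering --knob gpu-sampling-interval=%s --knob gpu-counters-mode=render-basic -- %s\n' "$interval" "$target"
}

# Placeholder arguments; replace them with your own interval and target.
build_collect_cmd 1 ./my_gles_app
```

To run the collection for real, copy the printed line into a shell on the machine where amplxe-cl is available.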
When the data collection is complete, do one of the following to view the result:
Use the -report action to view the data from the command line.
Use the -report-output action to write the report to a .txt or .csv file.
Open the data collection result (*.amplxe) in the VTune Amplifier graphical interface.
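The command-line options above might be used as in the following dry-run sketch, which prints two report commands rather than running them. The helper name print_report_cmds, the result directory r000gr, the output file summary.csv, and the choice of the summary report type are assumptions for illustration; substitute the result directory created by your own collection.

```shell
# Dry-run sketch: print the report commands instead of executing them.
# print_report_cmds is a hypothetical helper, not part of amplxe-cl.
print_report_cmds() {
    result_dir="$1"   # hypothetical result directory, for example r000gr
    out_file="$2"     # hypothetical output file, for example summary.csv
    # View a summary report on the command line.
    printf 'amplxe-cl -report summary -r %s\n' "$result_dir"
    # Write the same report to a comma-separated file.
    printf 'amplxe-cl -report summary -r %s -format csv -csv-delimiter comma -report-output %s\n' "$result_dir" "$out_file"
}

print_report_cmds r000gr summary.csv
```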