Intel® VTune™ Amplifier

Input and Output Analysis

Use the platform-wide Input and Output analysis to monitor utilization of the disk subsystem, CPU, and processor buses.

Note

This is a PREVIEW FEATURE on Windows* OS. A preview feature may or may not appear in a future production release. It is available for your use in the hopes that you will provide feedback on its usefulness and help determine its future. Data collected with a preview feature is not guaranteed to be backward compatible with future releases. Please send your feedback to parallel.studio.support@intel.com or to intelsystemstudio@intel.com.

This analysis type combines hardware event-based sampling with system-wide Ftrace* collection (Linux* and Android* targets) or ETW collection (Windows* targets). Together they provide a consistent view of the storage subsystem alongside hardware events, and an easy way to match user-level source code with the I/O packets executed by the hardware.

Disk Input and Output Analysis

The analysis relies on data produced by the kernel block driver subsystem. If your platform uses a non-standard block driver subsystem (for example, user-space storage drivers), I/O metrics are not available for this analysis type.
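On Linux targets, one quick way to see which devices the kernel block layer exposes is to list the entries under /sys/block. The sketch below is not part of VTune Amplifier, and the helper name list_block_devices is hypothetical. It rests on the assumption that a device driven entirely from user space is usually unbound from its kernel driver and so does not appear in this list.

```shell
#!/bin/sh
# Sketch: list devices that are visible through the Linux kernel block layer.
# list_block_devices is a hypothetical helper, not a VTune Amplifier command.
list_block_devices() {
    sysfs_dir="${1:-/sys/block}"        # sysfs directory to scan (default: /sys/block)
    [ -d "$sysfs_dir" ] || return 0     # nothing to list on non-Linux systems
    for dev in "$sysfs_dir"/*; do
        [ -e "$dev" ] || continue       # skip the unexpanded glob when the directory is empty
        basename "$dev"
    done
}

list_block_devices
```

If the storage device you want to profile is missing from the output, the disk I/O metrics for it are unlikely to be collected.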

The Input and Output analysis helps identify:

To run the Input/Output analysis, explore:

I/O Analysis Metrics

VTune Amplifier uses the following system-wide metrics for the I/O analysis:

Configuration Options

Prerequisites:

To run the Input and Output analysis:

  1. Click the New Analysis button on the Intel® VTune™ Amplifier toolbar (available in both the standalone GUI and the Visual Studio IDE).

    The New Amplifier Result tab opens with the Analysis Type window active.

  2. From the analysis tree on the left pane, select Platform Analysis > Input and Output.

    The analysis configuration pane opens on the right.

  3. Depending on your target application and analysis purpose, choose any of the following configuration options:

    Select IO API type to profile

    By default, VTune Amplifier profiles the System Disk IO API.

    For DPDK applications, select DPDK IO API.

    For SPDK applications, select SPDK IO API.

    Analyze memory bandwidth check box

    Collect the data required to compute memory bandwidth.

    The option is enabled by default.

    Evaluate max DRAM bandwidth check box

    Evaluate maximum achievable local DRAM bandwidth before the collection starts. This data is used to scale bandwidth metrics on the timeline and calculate thresholds.

    The option is enabled by default.

  4. Click Start to run the analysis.

To run the Input and Output analysis from the command line, enter:

$ amplxe-cl -collect io -- <target> [target_options]
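A slightly fuller command-line session might look like the sketch below, which collects into a named result directory and then prints a summary report. It assumes amplxe-cl is on your PATH. The result directory name r000io, the target ./my_app, and the helper function names are hypothetical placeholders. Check amplxe-cl -help for the options your version supports.

```shell
#!/bin/sh
# Sketch: collect an Input and Output result, then print its summary report.
RESULT_DIR="${RESULT_DIR:-r000io}"   # hypothetical result directory name
TARGET="${TARGET:-./my_app}"         # hypothetical target application

# Build the collection command line for a given result directory and target.
build_collect_cmd() {
    printf 'amplxe-cl -collect io -result-dir %s -- %s' "$1" "$2"
}

# Build the reporting command line for a given result directory.
build_report_cmd() {
    printf 'amplxe-cl -report summary -result-dir %s' "$1"
}

COLLECT_CMD=$(build_collect_cmd "$RESULT_DIR" "$TARGET")
REPORT_CMD=$(build_report_cmd "$RESULT_DIR")

if command -v amplxe-cl >/dev/null 2>&1 && [ -x "$TARGET" ]; then
    # Paths are assumed to contain no spaces, so plain word splitting is enough here.
    $COLLECT_CMD && $REPORT_CMD
else
    # Dry run when the tool or the target is not available.
    echo "Would run: $COLLECT_CMD"
    echo "Then:      $REPORT_CMD"
fi
```

Naming the result directory explicitly makes it easier to reopen the same result later in the GUI or to generate further reports from it.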

Input and Output Viewpoint

VTune Amplifier collects the data, generates a rxxxio result, and opens it in the default Input and Output viewpoint. The viewpoint displays statistics on I/O waits (Linux targets only), I/O operations, and I/O data transfers distributed over time and correlated with data on the application execution, along with other metrics that depend on the selected profiling type. For Disk IO analysis, start with the Disk Input and Output Histogram section of the Summary window. Identify slow I/O operations and switch to the grid view for further analysis.

What's Next

If you identified an imbalance between I/O and compute operations, consider modifying your code to make I/O operations asynchronous.
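As a process-level illustration of the idea, the sketch below starts each write as a background job so that the next computation does not wait for it. This is only an analogy: in application code you would more likely use an asynchronous I/O API or a dedicated I/O thread. The functions write_chunk and compute_chunk are hypothetical stand-ins for real work.

```shell
#!/bin/sh
# Sketch: overlap writes with computation by running the writes as background jobs.
OUT_DIR=$(mktemp -d)

write_chunk() {      # hypothetical stand-in for a slow I/O operation
    printf '%s\n' "$1" > "$OUT_DIR/chunk_$2.txt"
}

compute_chunk() {    # hypothetical stand-in for CPU-bound work
    echo $(( $1 * $1 ))
}

i=1
while [ "$i" -le 3 ]; do
    result=$(compute_chunk "$i")
    write_chunk "$result" "$i" &    # start the write and continue immediately
    i=$(( i + 1 ))
done

wait    # block once, after all computation has been issued
```

The single wait at the end is the point where the program synchronizes with its outstanding I/O, instead of stalling after every individual write.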

For I/O requests with long latency, check whether your data can be pre-loaded or written incrementally, or consider upgrading your storage device (for example, to an SSD).
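One simple form of pre-loading, sketched below, is to read an input file once before the latency-sensitive phase so that later reads are more likely to be served from the operating system's file cache. The helper name preload_file is hypothetical, and whether the data stays cached depends on available memory and on the operating system, so treat this as an experiment to measure and not as a guaranteed fix.

```shell
#!/bin/sh
# Sketch: read a file once so later reads may hit the OS file cache.
# preload_file is a hypothetical helper; caching behavior is OS dependent.
preload_file() {
    [ -r "$1" ] || return 1     # fail if the file is missing or unreadable
    cat "$1" > /dev/null        # sequential read, contents discarded
}

# Example usage with a throwaway file.
DATA_FILE=$(mktemp)
printf 'sample input\n' > "$DATA_FILE"
preload_file "$DATA_FILE" && echo "preloaded $DATA_FILE"
```

Rerunning the Input and Output analysis after such a change shows whether the long-latency requests actually moved out of the critical phase.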
