Intel® Advisor Help
Running your target application with the Intel Advisor can take substantially longer than running your target application without the Intel Advisor. For example:
Runtime Overhead / Analysis | Survey | Trip Counts & FLOP | Roofline | Dependencies | MAP |
---|---|---|---|---|---|
Target application runtime with Intel Advisor compared to runtime without Intel Advisor | 1.1x longer | 3 - 8x longer | 3.1 - 8.1x longer | 5 - 100x longer | 5 - 20x longer |
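To make the factors above concrete, the following sketch multiplies a baseline runtime by each overhead range. The 10-second baseline and the `estimate_range` helper are illustrative assumptions, not part of Intel Advisor.

```shell
#!/bin/sh
# Estimate the runtime range under an analysis from a baseline runtime
# and the overhead factors listed in the table.
# Usage: estimate_range BASELINE_SECONDS LOW_FACTOR HIGH_FACTOR
estimate_range() {
    awk -v base="$1" -v lo="$2" -v hi="$3" \
        'BEGIN { printf "%g-%g s\n", base * lo, base * hi }'
}

BASELINE=10   # hypothetical: the target runs 10 seconds without Intel Advisor

echo "Survey:             $(estimate_range "$BASELINE" 1.1 1.1)"
echo "Trip Counts & FLOP: $(estimate_range "$BASELINE" 3 8)"
echo "Roofline:           $(estimate_range "$BASELINE" 3.1 8.1)"
echo "Dependencies:       $(estimate_range "$BASELINE" 5 100)"
echo "MAP:                $(estimate_range "$BASELINE" 5 20)"
```

With these assumptions, a 10-second target could take anywhere from 50 to 1000 seconds under the Dependencies analysis, which is why the techniques that follow focus on the heavier analyses.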
The following techniques may help minimize overhead without limiting collection scope.
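Minimization Technique | Impacted Intel Advisor Analyses | Summary |
---|---|---|
Disable cache simulation | Memory Access Patterns, Trip Counts and FLOP | GUI controls: Enable cache simulation checkboxes. CLI action option: -no-enable-cache-simulation |
Limit reported data | Memory Access Patterns | GUI controls: Report stack variables and Report heap allocated variables checkboxes. CLI action options: -no-record-stack-frame, -no-record-mem-allocations |
Minimize data set | All, but especially Dependencies, Memory Access Patterns | Minimize number of instructions executed within a loop while thoroughly exercising target application control flow paths. |
Disable finalization | Roofline, Survey, Trip Counts and FLOP | GUI control: Vectorization Workflow pane > Cancel current analysis control during finalization. CLI action option: -no-auto-finalize |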
Minimize collection overhead.
Applicable analyses:
Memory Access Patterns (base simulation functionality)
Trip Counts and FLOP (enhanced simulation functionality that also requires setting the ADVIXE_EXPERIMENTAL=int_roofline environment variable)
Implement these techniques when cache modeling information is not important to you:
The default setting for all the properties/options in the table below is disabled.
Path: Project Properties > Analysis Target... | CLI Action Options | Description |
---|---|---|
Disable the Memory Access Patterns Analysis > Advanced > Enable cache simulation checkbox. | -no-enable-cache-simulation | Do not model cache misses, cache misses and cache line utilization, or cache misses and loop footprint. |
Disable the Trip Counts and FLOP Analysis > Advanced > Enable cache simulation checkbox. | -no-enable-cache-simulation | Do not: |
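As a rough sketch, a Memory Access Patterns collection with cache simulation explicitly disabled might look like the following. The project directory and target path are placeholders, `map` is assumed to be the collection name for Memory Access Patterns, and the script only invokes the tool when `advixe-cl` is found on the PATH.

```shell
#!/bin/sh
# Placeholders (assumptions): adjust to your project and target.
PROJECT_DIR=./myAdvisorProj
TARGET=./bin/myTargetApplication

# Memory Access Patterns collection without cache simulation.
MAP_CMD="advixe-cl -collect map -project-dir $PROJECT_DIR -no-enable-cache-simulation -- $TARGET"

echo "$MAP_CMD"
# Run only when the Intel Advisor CLI is available.
if command -v advixe-cl >/dev/null 2>&1; then
    $MAP_CMD
fi
```

Because the default setting is disabled, passing the option explicitly mainly records the intent in a script.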
Applicable analysis: Memory Access Patterns.
Implement these techniques when the additional data is not important to you.
The default setting for all the properties/options in the table below is enabled.
Project Properties > Analysis Target > Memory Access Patterns Analysis > Advanced | CLI Action Options | Description |
---|---|---|
Disable the Report stack variables checkbox. | -no-record-stack-frame | Do not report stack variables for which memory access strides are detected. |
Disable the Report heap allocated variables checkbox. | -no-record-mem-allocations | Do not report heap-allocated variables for which memory access strides are detected. |
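The two options in the table can be combined in one collection. The sketch below assumes `map` is the collection name for Memory Access Patterns and uses placeholder project and target paths; the tool is invoked only if `advixe-cl` is on the PATH.

```shell
#!/bin/sh
# Placeholders (assumptions): adjust to your project and target.
PROJECT_DIR=./myAdvisorProj
TARGET=./bin/myTargetApplication

# Options from the table that turn off per-variable reporting.
LIMIT_OPTS="-no-record-stack-frame -no-record-mem-allocations"

MAP_CMD="advixe-cl -collect map -project-dir $PROJECT_DIR $LIMIT_OPTS -- $TARGET"

echo "$MAP_CMD"
# Run only when the Intel Advisor CLI is available.
if command -v advixe-cl >/dev/null 2>&1; then
    $MAP_CMD
fi
```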
Minimize collection overhead.
Applicable analyses: All, but especially Dependencies, Memory Access Patterns.
When you run an analysis, the Intel Advisor executes the target against the supplied data set. Data set size and workload have a direct impact on target application execution time and analysis speed.
For example, it takes longer to process a 1000x1000 pixel image than a 100x100 pixel image. A possible reason: You may have loops with an iteration space of 1...1000 for the larger image, but only 1...100 for the smaller image. The exact same code paths may be executed in both cases. The difference is the number of times these code paths are repeated.
You can control analysis cost without sacrificing completeness by minimizing this kind of unnecessary repetition in target application execution.
Instead of choosing large, repetitive data sets, choose small, representative data sets that minimize the number of instructions executed within a loop while thoroughly exercising target application control flow paths.
Your objective: In as short a runtime period as possible, execute as many paths as you can afford, while minimizing the repetitive computation within each task to the bare minimum needed for good code coverage.
Data sets that run in about ten seconds or less are ideal. You can always create additional data sets to ensure all your code is checked.
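One way to check a candidate data set against the ten-second guideline is to time a plain run of the target before analyzing it. The helper below is a minimal sketch; the `check_runtime` function, the target path, and the data set file name are hypothetical.

```shell
#!/bin/sh
# Report whether a command finishes within the "about ten seconds" guideline.
# Timing uses whole seconds, so very short runs report 0s.
# Usage: check_runtime COMMAND [ARGS...]
check_runtime() {
    start=$(date +%s)
    "$@" >/dev/null 2>&1
    status=$?
    end=$(date +%s)
    elapsed=$((end - start))
    if [ "$elapsed" -le 10 ]; then
        echo "ok: ${elapsed}s (exit status $status)"
    else
        echo "too long: ${elapsed}s (exit status $status)"
    fi
}

# Hypothetical target and data set; checked only if the target exists.
if [ -x ./bin/myTargetApplication ]; then
    check_runtime ./bin/myTargetApplication small_100x100.png
fi
```

A data set that passes this check without Intel Advisor will still run longer under analysis, so treat the result as a lower bound.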
Minimize finalization overhead.
Applicable analyses: Roofline, Survey, Trip Counts and FLOP.
Use when you plan to view collected analysis data on a different machine. This is particularly useful if you are collecting analysis data on an Intel® Xeon Phi™ machine and plan to view the result on another machine. Finalization automatically occurs when a result is opened in the GUI or a report is generated from the result.
To implement, do one of the following while running an analysis:
When the analysis Finalizing data... phase begins, click the associated Cancel button.
Use the CLI action option -no-auto-finalize when you run the desired analysis. For example:
advixe-cl -collect survey -project-dir ./myAdvisorProj -no-auto-finalize -- ./bin/myTargetApplication
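Since finalization occurs when a report is generated from the result, the viewing machine can trigger it from the command line. The sketch below assumes the project directory has been copied to the viewing machine under the same placeholder path, and only invokes the tool when `advixe-cl` is on the PATH.

```shell
#!/bin/sh
# Assumption: the result collected with -no-auto-finalize was copied here.
PROJECT_DIR=./myAdvisorProj

# Generating a Survey report finalizes the result on this machine.
REPORT_CMD="advixe-cl -report survey -project-dir $PROJECT_DIR"

echo "$REPORT_CMD"
# Run only when the Intel Advisor CLI is available.
if command -v advixe-cl >/dev/null 2>&1; then
    $REPORT_CMD
fi
```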