- Information contained in the HLRS Wiki is not legally binding and HLRS is not responsible for any damages that might result from its use -
Revision as of 16:25, 4 June 2021
Intel® VTune™ Profiler is a performance analysis tool for serial and multithreaded applications.
Using Intel VTune
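To analyze the performance of your application with VTune, you do not need a special compiler wrapper or additional libraries. Just recompile and relink your code with the extra -g option to include debug information. VTune works well with dynamically linked binaries. For statically linked binaries, Intel provides some tips at https://software.intel.com/content/www/us/en/develop/documentation/vtune-help/top/set-up-analysis-target/linux-targets/analyzing-statically-linked-binaries-on-linux-targets.html.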
Example:
    module load vtune     # set up VTune environment
    module load gcc mpt
Compilation example:
    mpicxx -O2 -g -Wl,-Bdynamic main.cpp
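Since VTune works best with dynamically linked binaries, it can be worth checking the result of the compilation before starting an analysis. The following is a minimal sketch, not part of the HLRS setup: the helper name `is_dynamically_linked` is hypothetical, and it assumes that `ldd` prints at least one `=>` line for a dynamically linked executable (typical on glibc-based Linux systems).

```shell
#!/bin/bash
# Hypothetical helper: succeeds if the given executable is dynamically linked.
# Assumption: ldd lists at least one resolved shared library ("name => path")
# for dynamic executables and none for static binaries or non-ELF files.
is_dynamically_linked() {
    ldd "$1" 2>/dev/null | grep -q '=>'
}

# Example use after compiling (a.out is the compiler's default output name).
if [ -e ./a.out ]; then
    if is_dynamically_linked ./a.out; then
        echo "a.out is dynamically linked"
    else
        echo "a.out is not dynamically linked; see Intel's tips for static binaries"
    fi
fi
```

If the check fails, the `-Wl,-Bdynamic` option from the compilation example is the first thing to verify.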
Run analysis:
VTune provides both a GUI (vtune-gui) and a command-line tool (vtune). The following types of analysis are available on Hawk:
- hotspots - Analyze application flow and identify sections of code that take a long time to execute (hotspots).
- threading - Discover how well your application is using parallelism to take advantage of all available CPUs. Identify and locate synchronization issues causing overhead or idle wait time resulting in lost performance.
- memory-consumption - Analyze memory consumption by your Linux application, its distinct memory objects and their allocation stacks.
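When the analysis type is passed into a job script as a parameter, a small check against the three types listed above can catch typos before a large job is started. This is only a sketch; the variable `SUPPORTED_ANALYSES` and the function `is_supported_analysis` are hypothetical names, and the list simply mirrors the types named on this page.

```shell
#!/bin/bash
# Analysis types listed on this page as available on Hawk.
SUPPORTED_ANALYSES="hotspots threading memory-consumption"

# Hypothetical helper: returns 0 if $1 is one of the listed analysis types.
is_supported_analysis() {
    local candidate="$1" name
    for name in $SUPPORTED_ANALYSES; do
        [ "$candidate" = "$name" ] && return 0
    done
    return 1
}

# Example: validate the first script argument, defaulting to hotspots.
analysis="${1:-hotspots}"
if is_supported_analysis "$analysis"; then
    echo "selected analysis type: $analysis"
else
    echo "unsupported analysis type: $analysis" >&2
fi
```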
NOTE: The VTune project working directory and the results directory must both be located on the Lustre file system.
Example:
    module load vtune
    module load gcc mpt
    WORKDIR=/your/project/dir/on/lustre
    cd ${WORKDIR}
    mpirun -np 128 vtune -collect hotspots -r ${WORKDIR}/results_dir -- ./a.out your_input.file
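Because the working and results directories have to be on Lustre, a job script could verify this before launching the analysis. The sketch below is an illustration only: the function name `on_lustre` is hypothetical, and it assumes GNU `stat`, whose `-f -c %T` option prints the file system type in human-readable form (`lustre` for a Lustre mount).

```shell
#!/bin/bash
# Hypothetical helper: succeeds if the given path resides on a Lustre file system.
# Assumption: GNU stat is available and reports the type name "lustre".
on_lustre() {
    [ "$(stat -f -c %T "$1" 2>/dev/null)" = "lustre" ]
}

# Set WORKDIR to your project directory; falls back to the current directory.
WORKDIR="${WORKDIR:-$PWD}"
if on_lustre "$WORKDIR"; then
    echo "$WORKDIR is on Lustre"
else
    echo "warning: $WORKDIR does not appear to be on Lustre" >&2
fi
```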
Report:
You can also generate the report in text form using the VTune command line tool:
    vtune -help report
    vtune -report summary -r ${WORKDIR}/results_dir
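If a run leaves more than one result directory behind, the report command has to be repeated for each of them. The sketch below assumes, purely for illustration, that all result directories share a common name prefix (for example one directory per node or per run); the function `print_report_commands` is hypothetical and only prints the commands rather than executing them, so the list can be reviewed first.

```shell
#!/bin/bash
# Hypothetical helper: print one "vtune -report summary" command for every
# directory whose name starts with the given prefix. Nothing is executed.
print_report_commands() {
    local prefix="$1" dir
    for dir in "$prefix"*; do
        [ -d "$dir" ] || continue   # skip plain files and unmatched globs
        echo "vtune -report summary -r $dir"
    done
}

# Example: list the report commands for all result directories in WORKDIR.
print_report_commands "${WORKDIR:-.}/results_dir"
```

Piping the output to `sh` would run the printed commands once the list looks right.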
Alternatively, you can open the analysis results in the vtune-gui tool.
For some use cases you might need to limit the amount of raw data to be collected. Define this limit in MB through the data-limit option:
mpirun -np 128 vtune -collect hotspots -data-limit=200 -- ./a.out
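More information about VTune is available from Intel:
- Intel® VTune™ Profiler homepage: https://software.intel.com/content/www/us/en/develop/tools/oneapi/components/vtune-profiler.html
- Introduction to Intel® VTune™ Profiler: https://software.intel.com/content/www/us/en/develop/documentation/vtune-help/top/introduction.html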