- Information in the HLRS Wiki is not legally binding and is provided without warranty -

Difference between revisions of "Vampir"

From HLRS Platforms
(Update VapirServer documentation with example for HAWK.)

Revision as of 13:47, 30 April 2020

The Vampir suite of tools offers scalable event analysis through a GUI that renders very complex performance data quickly and interactively. The suite consists of VampirTrace, Vampir and VampirServer. Very large data volumes can be analyzed with the parallel VampirServer, which loads and analyzes the data on the compute nodes while the Vampir GUI attaches to it.

Vampir is based on standard Qt and works on desktop Unix workstations as well as on parallel production systems. The program is available for nearly all platforms, such as Linux-based PCs and clusters, IBM, SGI, SUN, NEC, HP, and Apple.

Developer: GWT-TUD GmbH
Platforms:
Category: Performance Analyzer
License: Commercial
Website: Vampir homepage


Usage

Vampir consists of a GUI and an analysis backend. To use Vampir, you first need to generate a trace of your application, preferably using VampirTrace. An Open Trace Format (OTF) trace consists of one file per MPI process (*.events.z), a trace definition file (*.def.z), and the master trace file (*.otf), which describes the other files. For details on how to generate OTF traces, see Vampirtrace.

Vampir

To analyze small traces (< 500 MB of trace data), you can use Vampir standalone with the default backend:

module load performance/vampir
vampir
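The size thresholds on this page can be turned into a small check. The following is only a sketch: `suggest_backend` is a hypothetical helper, not part of Vampir, and it assumes 10 GB equals 10240 MB. The page gives no recommendation for traces between 500 MB and 10 GB, so that range is reported as unspecified.

```shell
# Hypothetical helper (not part of Vampir): suggest a backend from the
# trace size in MB, using the thresholds quoted on this page.
suggest_backend() {
    size_mb=$1
    if [ "$size_mb" -lt 500 ]; then
        echo "vampir"          # small trace: standalone GUI with default backend
    elif [ "$size_mb" -gt 10240 ]; then
        echo "vampirserver"    # large trace: parallel backend on compute nodes
    else
        echo "unspecified"     # the page makes no recommendation for this range
    fi
}

suggest_backend 300
```

The size of a trace directory in MB can be obtained with, for example, `du -sm <trace directory> | cut -f1`.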


VampirServer

For large-scale traces (> 10 GB and up to many thousand MPI processes), use the parallel VampirServer backend (on compute nodes allocated through the queuing system), and attach to it using vampir. You will most likely choose the number of nodes based on the trace file size and the memory available on a single node. Note, however, that holding the trace may require more than twice its file size in memory.

To start the server, first request an interactive job. For example, to run VampirServer for 1 hour on 4 nodes of HAWK with 512 processes in total:
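The node count can be estimated with integer arithmetic. This is a rough sketch under stated assumptions: `nodes_needed` is a hypothetical helper, the default factor of 2 is only the lower bound implied by the rule of thumb above, and the example values are illustrative rather than actual HAWK specifications.

```shell
# Hypothetical helper: estimate how many nodes to request.
#   $1  trace size in GB
#   $2  usable memory per node in GB (check the system documentation)
#   $3  memory factor relative to the trace size (default 2, a lower bound)
nodes_needed() {
    trace_gb=$1
    mem_gb=$2
    factor=${3:-2}
    # Integer ceiling of (trace_gb * factor) / mem_gb
    echo $(( (trace_gb * factor + mem_gb - 1) / mem_gb ))
}

# Illustrative values only: a 400 GB trace and 256 GB of memory per node.
nodes_needed 400 256
```

Since the memory requirement may exceed twice the trace size, it can be sensible to pass a larger factor as the third argument.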

qsub -I -lselect=4:mpiprocs=128,walltime=1:0:0

module load vampirserver # on vulcan use module load performance/vampirserver instead

vampirserver start -n $((512 - 1))
Warning: The number of analysis processes must not exceed the number of processes requested minus one!
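To avoid exceeding the limit stated in the warning, the process count can be derived from the values used in the job request. A minimal sketch, in which the variable names are illustrative and the values must match the `select` and `mpiprocs` settings of the `qsub` call:

```shell
# Values must match the qsub request (select=4:mpiprocs=128 in the example).
NODES=4
MPIPROCS_PER_NODE=128

# One requested process is not available for analysis, hence the "- 1".
ANALYSIS_PROCS=$(( NODES * MPIPROCS_PER_NODE - 1 ))

# Print the resulting command instead of running it.
echo "vampirserver start -n ${ANALYSIS_PROCS}"
```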

This will show you a connection host and port:

VampirServer 9.8.0
Licensed to HLRS
Running 511 analysis processes. (abort with vampirserver stop 66)
Server listens on: r15c1t7n1:30000

From this output, note down the server name and port as well as the command to stop VampirServer.

Now open a new shell, log in to one of the login nodes of the system via ssh (don't forget X forwarding), and open vampir.

module load vampir # on vulcan use module load performance/vampir instead
vampir

Select "Open Other" and then choose "Remote File". In the dialog that opens, enter the server name and port displayed by VampirServer earlier. Proceed and select the trace you want to open.
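If the server output is captured as text, the connection details can also be extracted with standard tools. The sketch below assumes the output format shown on this page and uses that sample as input; other VampirServer versions may format the lines differently.

```shell
# Sample output copied from this page; in practice, use the real server output.
output='VampirServer 9.8.0
Licensed to HLRS
Running 511 analysis processes. (abort with vampirserver stop 66)
Server listens on: r15c1t7n1:30000'

# Extract "host:port" from the "listens on:" line and split it.
endpoint=$(printf '%s\n' "$output" | sed -n 's/^.*listens on: *//p')
host=${endpoint%%:*}
port=${endpoint##*:}

# Extract the stop command from the parenthesised hint.
stop_cmd=$(printf '%s\n' "$output" | sed -n 's/^.*(abort with \(.*\)).*$/\1/p')

echo "host=$host port=$port stop=$stop_cmd"
```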

Example of remote open on Nehalem-Cluster

See also

External links