- Information contained in the HLRS Wiki is not legally binding and HLRS is not responsible for any damages that might result from its use -
NEC Cluster FAQ (laki + laki2)
System usage
What is the minimal command to submit a job to the batch system?
- Specify the number of nodes, the number of processor cores per node, the type of the nodes (probably 'nehalem') and the desired walltime.
Example: 4 nodes, 8 processor cores per node for two hours
qsub -l nodes=4:ppn=8:nehalem,walltime=2:00:00 ./myscript
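As an illustration, the same resource request can also be placed inside the job script as #PBS directives, so that a plain qsub ./myscript is sufficient. The following is only a sketch: the program name ./test.out and the process count of 32 (4 nodes times 8 cores) are assumptions, and PBS_O_WORKDIR is the directory from which the job was submitted.

```shell
# Write a minimal job script named myscript (hypothetical content)
cat > myscript <<'EOF'
#!/bin/bash
#PBS -l nodes=4:ppn=8:nehalem
#PBS -l walltime=2:00:00

# Change to the directory from which the job was submitted
cd "$PBS_O_WORKDIR"

# 4 nodes x 8 cores = 32 MPI processes; ./test.out is a placeholder
mpirun -np 32 ./test.out
EOF
chmod +x myscript
```

With the directives in the script, options given on the qsub command line still take precedence over them.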
How can I filter the queue to only show my jobs?
- The commands showq and qstat display job information. If there are many jobs in the queue, it is more convenient to filter the output by user name.
showq -u <user-name>
qstat -u <user-name>
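To avoid typing the user name each time, both calls can be wrapped in a small shell function. This is a sketch only: the function name myjobs is hypothetical, and it assumes that your cluster login name is available in $USER (falling back to id -un otherwise).

```shell
# Show only the current user's jobs with both tools
myjobs() {
    me="${USER:-$(id -un)}"   # current login name
    showq -u "$me"
    qstat -u "$me"
}
```

After adding the function to your shell startup file, calling myjobs lists your own jobs.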
Errors
- mpirun: spawn failed with errno=-11
- In combination with the openmpi module, a call like
mpirun -np 2 -hostfile $PBS_NODEFILE ./test.out
causes an error like
[n110402:02618] pls:tm: failed to poll for a spawned proc, return status = 17002
[n110402:02618] [0,0,0] ORTE_ERROR_LOG: In errno in file ../../../../../orte/mca/rmgr/urm/rmgr_urm.c at line 462
[n110402:02618] mpirun: spawn failed with errno=-11
You simply have to omit the -hostfile option:
mpirun -np 2 ./test.out
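If the node file is still needed to determine the number of processes, it can be read without passing it to mpirun. The sketch below assumes that $PBS_NODEFILE contains one line per allocated processor core, as is usual for this type of batch system; the helper name count_slots is hypothetical.

```shell
# Count the processor slots listed in a node file (one line per slot)
count_slots() {
    wc -l < "$1" | tr -d ' '
}

# Inside a job: start one MPI process per slot, without -hostfile
if [ -n "${PBS_NODEFILE:-}" ] && [ -r "$PBS_NODEFILE" ]; then
    mpirun -np "$(count_slots "$PBS_NODEFILE")" ./test.out
fi
```

Outside of a batch job PBS_NODEFILE is not set, so the guard skips the mpirun call.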
Software Development
Where can I find documentation for compilers?
- Intel Fortran
- Intel C++
- GCC - GNU Compiler Collection
Documentation for MPI libraries
Documentation for numerical libraries
Documentation for the batch system
- The most important commands are qsub, qdel, qstat.
- The command showstart <job-id> reports the estimated start time of a queued job, which is quite interesting for the impatient user.
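A typical sequence of these commands might look like the sketch below. It assumes that qsub prints the id of the new job on standard output; the function name submit_and_report and the resource request are only examples.

```shell
# Submit a job script, then show its status and estimated start time.
# Prints the job id last, so the job can be removed with: qdel <job-id>
submit_and_report() {
    script="$1"
    jobid=$(qsub -l nodes=4:ppn=8:nehalem,walltime=2:00:00 "$script") || return 1
    qstat "$jobid"       # current state of the job
    showstart "$jobid"   # estimated start time
    echo "$jobid"
}
```

For example, submit_and_report ./myscript submits the script and prints the job id on the last line of its output.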
Support
Development Support Question
I cannot build my application, it is not running, or I did nothing and it does not work anymore. Please help me.
To get useful help with making your application work, you need to provide some information.
If you have problems building your application, please provide the following information.
- Used software modules (module list),
- The calls to the tools like compiler or linker,
- The output from these tools.
If you have problems during the execution of your application, please provide the following information.
- The command which you have used to submit your job to the execution queue (qsub ...),
- Used software modules (module list),
- The path where you executed your program, the command or script used to execute it, and information about the input files.
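Part of this information can be recorded automatically from within the job script. The following is a sketch under assumptions: the function name record_run_info and the log file name run-info.log are hypothetical, PBS_JOBID is expected to be set by the batch system inside a job, and module list is assumed to write to standard error.

```shell
# Write run-time details for a support request to a log file
record_run_info() {
    {
        echo "job id:   ${PBS_JOBID:-not set}"
        echo "run path: $(pwd)"
        echo "modules:"
        module list 2>&1   # redirect stderr so the list ends up in the log
    } > "${1:-run-info.log}"
}
```

Calling record_run_info at the top of the job script leaves a run-info.log file in the working directory. The qsub command line and the input files still have to be noted by hand.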
If you have not used your application for some time, recompiled it without changing anything else, and now run into problems, please recompile your application with a command like
make ... 2>&1 | tee make.log
Please check the output carefully for warnings which could potentially be hints for problems. If this does not help, bundle this log together with the information mentioned above in your support question.
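The build information can be collected and bundled in one step, for example as sketched below. All file names (modules.log, make.log, support-info.tar.gz) and the function name collect_build_info are hypothetical, and module list is assumed to write to standard error.

```shell
# Collect build diagnostics for a support request into one archive
collect_build_info() {
    # List the loaded software modules (output goes to stderr, so redirect it)
    module list > modules.log 2>&1
    # Record the compiler and linker calls together with their output
    make "$@" 2>&1 | tee make.log
    # Bundle both logs for the support question
    tar -czf support-info.tar.gz modules.log make.log
}
```

Arguments are passed on to make unchanged, so collect_build_info clean all would rebuild from scratch while logging.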