
Open MPI

Open MPI is a Message Passing Interface (MPI) library project combining technologies and resources from several other projects (FT-MPI, LA-MPI, LAM/MPI, and PACX-MPI).
Developer: Open MPI Development Team
Platforms: NEC Nehalem Cluster
Category: MPI
License: New BSD license
Website: Open MPI homepage


Examples

simple example

This example shows the basic steps when using Open MPI.

Load the necessary module

module load mpi/openmpi
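
To verify that the module is active (an optional check, assuming the standard Environment Modules commands are available):

module list   # mpi/openmpi should appear in the list of loaded modules
which mpicc   # should resolve to the Open MPI compiler wrapper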


Compile your application using the MPI wrapper compilers mpicc, mpic++ or mpif90:

mpicc your_app.c -o your_app
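
If you want to see which compiler and flags the wrapper actually invokes, the Open MPI wrappers accept the --showme option; this only prints the underlying command line instead of compiling:

mpicc --showme your_app.c -o your_app   # show the underlying compiler invocation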


Now we run our application using 128 processes spread across 16 nodes in an interactive job (-I option):

qsub -l nodes=16:ppn=8,walltime=6:00:00 -I   # get 16 nodes for 6 hours
mpirun -np 128 your_app                      # run your_app using 128 processes
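
The same steps also work in a regular batch job instead of an interactive one. A minimal job script might look like the following sketch (the file name job.pbs is just an example):

File: job.pbs
#!/bin/bash
#PBS -l nodes=16:ppn=8,walltime=6:00:00
cd $PBS_O_WORKDIR            # directory the job was submitted from
module load mpi/openmpi      # same module as above
mpirun -np 128 your_app      # run your_app using 128 processes

Submit it with qsub job.pbs.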


specifying the number of processes per node

Open MPI divides resources into units called 'slots'. Specifying ppn=X to the batch system sets the number of slots per node. For a simple MPI job with 8 processes per node (= 1 process per core), ppn=8 is the best choice, as in the example above. Further details can be specified on the mpirun command line. The PBS setup is adjusted for ppn=8; please do not use other values.

If you want to use fewer processes per node, e.g. because you are restricted by memory requirements or because you have a hybrid parallel application using MPI and OpenMP, Open MPI would by default still place the first 8 processes on the first node, the next 8 on the second node, and so on. To avoid this, you can use the -npernode option.

mpirun -np X -npernode 2 your_app

This would start 2 processes per node. This way you can use a larger number of nodes with a smaller total number of processes, or you can, for example, start additional threads from each process.
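
For example, a hybrid MPI/OpenMP run with 2 processes per node and 4 OpenMP threads per process on 16 nodes could look like this (the numbers are only an illustration; -x exports the environment variable to all processes):

mpirun -np 32 -npernode 2 -x OMP_NUM_THREADS=4 your_app   # 16 nodes, 2 processes per node, 4 threads each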


process pinning

If you want to pin your processes to a CPU (and enable NUMA memory affinity), use

mpirun -np X --mca mpi_paffinity_alone 1 your_app


Warning: This will not behave as expected for hybrid multi-threaded applications (MPI + OpenMP), as the threads will be pinned to a single CPU as well! Use this only if you want to pin one process per core - no extra threads!


thread pinning

For pinning of hybrid MPI/OpenMP applications, use the following wrapper script:

File: thread_pin_wrapper.sh
#!/bin/bash
export KMP_AFFINITY=verbose,scatter           # Intel specific environment variable
export OMP_NUM_THREADS=4

RANK=${OMPI_COMM_WORLD_RANK:=$PMI_RANK}       # MPI rank of this process
if [ $(expr $RANK % 2) = 0 ]
then
     # even ranks: pin to the first socket (cores 0-3) and its memory node
     export GOMP_CPU_AFFINITY=0-3
     numactl --preferred=0 --cpunodebind=0 "$@"
else
     # odd ranks: pin to the second socket (cores 4-7) and its memory node
     export GOMP_CPU_AFFINITY=4-7
     numactl --preferred=1 --cpunodebind=1 "$@"
fi
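
Make sure the wrapper script is executable before launching it:

chmod +x thread_pin_wrapper.sh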


Run your application with the following command:

mpirun -np X -npernode 2 thread_pin_wrapper.sh your_app


Warning: Do not use the mpi_paffinity_alone option in this case!

