Open MPI is a Message Passing Interface (MPI) library project combining technologies and resources from several other projects (FT-MPI, LA-MPI, LAM/MPI, and PACX-MPI).

* Developer: Open MPI Development Team
* Platforms: NEC Nehalem Cluster
* Category: MPI
* License: New BSD license
* Website: [http://www.open-mpi.org/ Open MPI homepage]

== Examples ==

=== simple example ===
This example shows the basic steps when using Open MPI.

Load the necessary module:
{{Command|command =
module load mpi/openmpi
}}

Compile your application using the MPI wrapper compilers mpicc, mpic++ or mpif90:
{{Command|command =
mpicc your_app.c -o your_app
}}
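
If you have no application at hand, a minimal MPI program is enough to test the toolchain. The following <code>your_app.c</code> is only an illustrative sketch (the file name is taken from the compile command above, the contents are an assumption):
{{File|filename=your_app.c|content=<pre>
#include <stdio.h>
#include <mpi.h>

int main(int argc, char **argv)
{
    int rank, size;

    MPI_Init(&argc, &argv);                /* initialize the MPI library        */
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);  /* rank of this process              */
    MPI_Comm_size(MPI_COMM_WORLD, &size);  /* total number of started processes */

    printf("Hello from rank %d of %d\n", rank, size);

    MPI_Finalize();                        /* shut down the MPI library         */
    return 0;
}
</pre>
}}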

Now we run the application using 128 processes spread across 16 nodes in an interactive job (<code>-I</code> option):
{{Command | command =
qsub -l nodes=16:ppn=8,walltime=6:00:00 -I            # get 16 nodes for 6 hours
mpirun -np 128 your_app                               # run your_app using 128 processes
}}

=== specifying the number of processes per node ===
Open MPI divides resources into units called 'slots'. By specifying <code>ppn:X</code> to the batch system, you set the number of slots per node. For a simple MPI job with 8 processes per node (= 1 process per core), <code>ppn:8</code> is the best choice, as in the example above. Details can be specified on the <code>mpirun</code> command line. The PBS setup is adjusted for <code>ppn:8</code>; please do not use other values.

If you want to use fewer processes per node, e.g. because you are restricted by memory requirements or you have a hybrid parallel application using MPI and OpenMP, Open MPI would put the first 8 processes on the first node, the second 8 on the second, and so on. To avoid this, you can use the <code>-npernode</code> option.
{{Command
| command = mpirun -np X -npernode 2 your_app
}}
This would start 2 processes per node. Like this, you can use a larger number of nodes with a smaller number of processes, or you can e.g. start threads out of the processes.
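
As a sketch of this hybrid case (each MPI process starting OpenMP threads), a program could look like the following. The file name <code>hybrid_example.c</code> and the contents are just an illustration, not part of the HLRS setup; build it through the mpicc wrapper with the OpenMP flag of the underlying compiler (e.g. -fopenmp for GCC).
{{File|filename=hybrid_example.c|content=<pre>
#include <stdio.h>
#include <mpi.h>
#include <omp.h>

int main(int argc, char **argv)
{
    int provided, rank;

    /* request thread support; FUNNELED is sufficient as long as
       only the master thread makes MPI calls */
    MPI_Init_thread(&argc, &argv, MPI_THREAD_FUNNELED, &provided);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    /* each MPI process spawns OMP_NUM_THREADS OpenMP threads */
    #pragma omp parallel
    {
        printf("rank %d, thread %d of %d\n",
               rank, omp_get_thread_num(), omp_get_num_threads());
    }

    MPI_Finalize();
    return 0;
}
</pre>
}}
Started with <code>mpirun -np X -npernode 2</code> and <code>OMP_NUM_THREADS</code> set accordingly, each node runs 2 such processes, each of which opens its own team of OpenMP threads (see also the thread pinning section below).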

=== process pinning ===
If you want to pin your processes to a CPU (and enable NUMA memory affinity), use
{{Command
| command = mpirun -np X --mca mpi_paffinity_alone 1 your_app
}}

{{Warning
| text = This will not behave as expected for hybrid multi-threaded applications (MPI + OpenMP), as the threads will be pinned to a single CPU as well! Use this only if you want to pin one process per core - no extra threads!
}}

=== thread pinning ===
For pinning of hybrid MPI/OpenMP applications, use the following wrapper script:
{{File|filename=thread_pin_wrapper.sh|content=<pre>
#!/bin/bash
export KMP_AFFINITY=verbose,scatter          # Intel-specific environment variable
export OMP_NUM_THREADS=4

# rank of this process, as set by Open MPI (or a PMI-based launcher)
RANK=${OMPI_COMM_WORLD_RANK:=$PMI_RANK}

# bind even ranks to the first socket/NUMA node, odd ranks to the second
if [ $(expr $RANK % 2) = 0 ]
then
    export GOMP_CPU_AFFINITY=0-3
    numactl --preferred=0 --cpunodebind=0 "$@"
else
    export GOMP_CPU_AFFINITY=4-7
    numactl --preferred=1 --cpunodebind=1 "$@"
fi
</pre>
}}

Run your application with the following command:
{{Command
| command = mpirun -np X -npernode 2 thread_pin_wrapper.sh your_app
}}

{{Warning| text =
Do not use the mpi_paffinity_alone option in this case!
}}