- Infos im HLRS Wiki sind nicht rechtsverbindlich und ohne Gewähr -
- Information contained in the HLRS Wiki is not legally binding and HLRS is not responsible for any damages that might result from its use -

GPI-2

From HLRS Platforms
Revision as of 13:10, 21 November 2018 by Hpcerdei (talk | contribs)
Jump to navigationJump to search
GPI-2 (Global address space Programming Interface) is a threadsafe PGAS API for Infiniband,ROCE,ETHERNET,GEMINI and ARIES networks.

GPI-2 aims at high performance, delivering wire-speed from the interconnect. It relies on one-sided and asynchronous communication that allow a perfect overlap between computation and communication. All GPI2 methods provide a timeout functionality for fault tolerant operation. With that in place, GPI-2 offers mechanisms that allow applications to react to failures and continue its execution.

Copy-logo-fraunhofer-itwm.gif
Developer: Fraunhofer ITWM
Platforms: Hornet (Cray XC30)
Category: Communication libraries
License: commercial
Website: GPI homepage


Using GPI-2 on Cray XC30

Load the necessary module. For example:

module swap PrgEnv-cray PrgEnv-gnu
module load gpi2/1.3.2

Example

File: hello_gpi2.c
#include <stdio.h>
#include <stdlib.h>
#include <GASPI.h>

int main(int argc,char *argv[]){

gaspi_rank_t rank,tnc;
gaspi_float vers,vers_gni;
char mtype[16];

  if( gaspi_proc_init(GASPI_BLOCK) != GASPI_SUCCESS ){
    printf("gaspi_init failed !\n");
    exit(-1);
  }

  gaspi_version(&vers);
  gaspi_proc_rank(&rank);
  gaspi_proc_num(&tnc);

  printf("rank: %d tnc: %d (gpi2: %.2f ugni: %.2f)\n",rank,tnc,vers,vers_gni);

  if( gaspi_barrier(GASPI_GROUP_ALL,GASPI_BLOCK) != GASPI_SUCCESS ){
    printf("gaspi_barrier failed !\n");
    exit(-1);
  }

  gaspi_proc_term(GASPI_BLOCK);
  return 0;
}


Compilation example

cc -O2 hello_gpi2.c -D_GNU_SOURCE -lpmi -lugni -lrca -lGPI2 -o hello_gpi2.bin

Example to run the program

GPI-2 Applications should be started with one process per NUMA Socket. Use threads to exploit SMP parallelism on each NUMA Socket (e.g. mctp3 for best performance). Ex.:

# 2 nodes, 24 cores (ht is enabled), interactive
qsub -I -l nodes=2:ppn=24

# 4 procs on 2 nodes,socket affinity mask, ht on
aprun -q -n 4 -N 2 -cc numa_node -d 12 -ss -j 2 ./hello_gpi2.bin

See also

External links