- Information contained in the HLRS Wiki is not legally binding and HLRS is not responsible for any damages that might result from its use -

CPE: Difference between revisions

Revision as of 14:39, 18 July 2023

  • Cray Programming Environment User Guide: CSM for HPE Cray EX Systems
  • HPE CPE User Guide

CPE on Hawk

Setup

The archive cpe_on_hawk.tar contains the following files:

 $ tar -tf cpe_on_hawk.tar
 enable_cpe.sh
 compiler/aocc/lmod/modulefiles/core/
 (...)
 compiler/intel/lmod/modulefiles/core/
 (...)
 test/placement/job.sh
  • Bash script to initialize CPE: enable_cpe.sh
  • Additional module files for PrgEnv-aocc and PrgEnv-intel: compiler/
  • Example PBS job script (for xthi): test/placement/job.sh

To use PrgEnv-aocc and PrgEnv-intel, place the compiler directory in a directory of your choice (DIR) and set REPO_ROOT=DIR in enable_cpe.sh.
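
The REPO_ROOT setting can be scripted. The following is a minimal sketch, assuming that enable_cpe.sh contains a line starting with REPO_ROOT= (the script's actual layout is not shown here). The function name set_repo_root and the directory $HOME/cpe are hypothetical.

```shell
# Hypothetical helper: rewrite the REPO_ROOT assignment in enable_cpe.sh.
# Assumption: the script has a line of the form REPO_ROOT=<something>.
# $1: path to enable_cpe.sh, $2: DIR (the directory that contains compiler/)
set_repo_root() {
    sed -i "s|^REPO_ROOT=.*|REPO_ROOT=$2|" "$1"
}

# Usage (not executed here):
#   mkdir -p "$HOME/cpe" && tar -xf cpe_on_hawk.tar -C "$HOME/cpe"
#   set_repo_root "$HOME/cpe/enable_cpe.sh" "$HOME/cpe"
```

Check the result with grep REPO_ROOT enable_cpe.sh before sourcing the script.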

Loading CPE

 $ module list
 
 Currently Loaded Modules:
   1) system/site_names   3) system/wrappers/1.0           5) gcc/10.2.0
   2) system/ws/1.4.0     4) hlrs-software-stack/current   6) mpt/2.26
 
 $ . enable_cpe.sh
 $ module list
 
 Currently Loaded Modules:
   1) system/site_names     6) perftools-base/22.09.0  11) cray-libsci/22.11.1.2
   2) system/ws/1.4.0       7) cce/15.0.0              12) cray-pals/1.2.4
   3) system/wrappers/1.0   8) craype/2.7.19           13) PrgEnv-cray/8.3.4
   4) craype-x86-rome       9) cray-dsmml/0.2.2
   5) craype-network-ucx   10) cray-mpich-ucx/8.1.21
 
 $ cc --version
 Cray clang version 15.0.0  (324a8e7de6a18594c06a0ee5d8c0eda2109c6ac6)
 Target: x86_64-unknown-linux-gnu
 Thread model: posix
 InstalledDir: /opt/cray/pe/cce/15.0.0/cce-clang/x86_64/share/../bin

Unloading CPE

 $ module restore
 Resetting modules to system default (...)
 
 $ echo $MODULEPATH | tr : '\n'
 /opt/hlrs/non-spack/rev-009_2022-09-01/modulefiles/mpt/2.26/gcc/10.2.0
 /sw/hawk-rh8/hlrs/spack/rev-009_2022-09-01/modulefiles/linux-rocky8-x86_64/mpt/2.26-zbg2lgp/gcc/10.2.0
 /opt/hlrs/spack/rev-009_2022-09-01/modulefiles/linux-rocky8-x86_64/gcc/10.2.0
 /opt/hlrs/non-spack/rev-009_2022-09-01/modulefiles/gcc/10.2.0
 /opt/hlrs/spack/rev-009_2022-09-01/modulefiles/linux-rocky8-x86_64/gcc/8.5.0
 /opt/hlrs/non-spack/rev-009_2022-09-01/modulefiles/gcc/8.5.0
 /opt/hlrs/non-spack/system/modulefiles
 
 $ echo $LD_LIBRARY_PATH | tr : '\n'
 /opt/hlrs/non-spack/rev-009_2022-09-01/compiler/gcc/10.2.0/mpt_custom_fortran_modules/2.26
 /opt/hlrs/non-spack/mpi/mpt/2.26/lib
 /opt/hlrs/non-spack/rev-009_2022-09-01/compiler/gcc/10.2.0/lib64
 /opt/cray/pe/lib64 # <--- Need a proper modulefile to undo this change
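
Until such a modulefile exists, the entry that remains after module restore can be removed by hand. The following is a minimal sketch, not part of enable_cpe.sh. The function name strip_path_entry is hypothetical.

```shell
# Hypothetical helper: print a colon-separated path list with one entry removed.
# The body runs in a subshell, so IFS and "set -f" do not leak into the caller.
# $1: path list, $2: entry to remove
strip_path_entry() (
    IFS=:
    set -f                      # no filename expansion while splitting
    result=""
    for item in $1; do
        [ "$item" = "$2" ] && continue
        result="${result:+$result:}$item"
    done
    printf '%s\n' "$result"
)

# Remove the CPE library directory that is still listed after "module restore".
LD_LIBRARY_PATH=$(strip_path_entry "${LD_LIBRARY_PATH:-}" /opt/cray/pe/lib64)
export LD_LIBRARY_PATH
```

Running echo $LD_LIBRARY_PATH | tr : '\n' again should then show only the entries of the default environment.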

General Notes

  • Use the compiler drivers cc, CC, and ftn to compile and link C, C++, and Fortran code. If you need to debug a build or want to see which tools are invoked with which flags and options, compile with -v or --cray-print-opts=all.
  • Cray compilers can generate detailed yet readable optimization reports. Try -fsave-loopmark (C/C++) or -rm (Fortran) and inspect the resulting listing files (*.lst).
  • The Cray Fortran compiler is a highly optimizing compiler that performs extensive analyses at -O2 or greater. If you value short compile times during development, consider dialing down the optimization level to -O1 or -O0, or see if -Oipa1 or -Oipa0 makes a difference. Be aware that the default optimization level is -O2 (and -Oipa3), not -O0.
  • To get more information about compiler messages, warnings, errors, and runtime errors, use explain. For example, explain ftn-2103 describes Fortran warning 2103.
  • man intro_directives gives an overview of the pragmas and directives supported by the Cray compilers, in case you are looking for specific extensions.
  • Prefer mpiexec to mpirun in PBS batch scripts, unless you make sure that aliases are expanded, for example with shopt -s expand_aliases.
  • man mpiexec contains a few examples of binding processes and threads to CPUs. Use --cpu-bind depth -d $OMP_NUM_THREADS for hybrid programs. In general, we recommend experimenting with xthi to verify that you get the binding you want.
  • The Parallel Application Launch Service (PALS) is not configured to allow launching applications on login nodes. You can run mpirun or mpiexec only on compute nodes.
  • The environment variable PE_ENV can be used to determine which PrgEnv-<env> is active. If something requires Cray compilers, for example, you can test if [ "$PE_ENV" = "CRAY" ].
  • In some cases, it is necessary to prepend CRAY_LD_LIBRARY_PATH to LD_LIBRARY_PATH, in particular when you want to use a non-default version of a dynamic library that is not present in /opt/cray/pe/lib64. If ldd <exe> reports missing dependencies, check if LD_LIBRARY_PATH=$CRAY_LD_LIBRARY_PATH:$LD_LIBRARY_PATH ldd <exe> resolves them.
  • When running pat_report on an empty experiment data directory, include -i <exe> to specify where your instrumented executable <exe> can be found. This is more robust than relying on a recorded path, which may not exist anymore at the time of running pat_report. You won't need this option if you launch your instrumented program with mpiexec --no-transfer.
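
Two of the notes above, the PE_ENV test and the hybrid launch line, can be sketched as small shell helpers. This is an illustration under assumptions: the function names are hypothetical, and the -n option for the number of ranks is not mentioned in the notes above.

```shell
# Hypothetical helper: succeed only if the requested PrgEnv is active.
# Example: require_pe_env CRAY
require_pe_env() {
    if [ "${PE_ENV:-}" != "$1" ]; then
        echo "need PE_ENV=$1, found '${PE_ENV:-unset}'" >&2
        return 1
    fi
}

# Hypothetical helper: print (not run) the launch line for a hybrid
# MPI+OpenMP program, using the binding options recommended above.
# $1: number of MPI ranks, $2: OpenMP threads per rank, $3: executable
hybrid_launch_cmd() {
    echo "OMP_NUM_THREADS=$2 mpiexec -n $1 --cpu-bind depth -d $2 $3"
}
```

In a build script, require_pe_env CRAY || exit 1 stops early when the wrong PrgEnv is loaded. Printing the launch line first, for example with hybrid_launch_cmd 8 4 ./xthi, makes it easy to review the binding options before submitting a job.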

Known Issues and Solutions/Workarounds

  • Using PrgEnv-gnu and gcc/12.1.0, you might see ld complain about being "unable to initialize decompress status for section .debug_info" during linking. This warning goes away after loading an older version of gcc, for instance, gcc/10.3.0. Likewise, if you have trouble generating debugging information (-g), consider switching to an older version of gcc.
  • Using PrgEnv-intel and intel/2022.0.2 or intel-oneapi/2022.0.2, you will see ftn (ifort or ifx) warn about "overriding '-march=core-avx2' with '-march=core-avx2'". The reason is that -march=core-avx2 is indeed passed twice, by the installation's ifort.cfg or ifx.cfg and by CPE's x86-rome CPU target. You can ignore this warning.
  • If you try to populate an empty experiment data directory using pat_report, but pat_report can't open the instrumented executable, referring to a file in /var/run/palsd/... that no longer exists, invoke pat_report with -i <exe> as suggested above, or use mpiexec --no-transfer to run your experiment.
  • Some runtime libraries, such as libcrayacc, depend on libcuda, which is not available everywhere. If you want to cross-compile GPU code on a login node, invoke the linker with --allow-shlib-undefined (-Wl,--allow-shlib-undefined) to allow undefined symbols in shared objects. The resulting executable will run on an AI node.
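
The pat_report workaround can be wrapped so that a wrong executable path is caught before pat_report starts. This is a sketch under assumptions: the function name report_with_exe is hypothetical, and the call assumes that the experiment data directory is passed as the last argument.

```shell
# Hypothetical wrapper: run pat_report with an explicit instrumented executable.
# $1: instrumented executable, $2: experiment data directory
report_with_exe() {
    if [ ! -x "$1" ]; then
        echo "instrumented executable not found: $1" >&2
        return 1
    fi
    pat_report -i "$1" "$2"
}

# Usage (not executed here; both names are placeholders):
#   report_with_exe ./my_app+pat ./my_experiment_dir
```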

Known Issues and Current Limitations

  • The current version of CrayPat (perftools and perftools-lite) has a bug in combination with PBS Pro and PALS that can cause different node names to map to the same node ID. This affects not only the summary printed by pat_report ("Numbers of PEs per Node"), but also rank order suggestions ("No rank order was suggested because all ranks are on one node.").
  • The PGAS programming models (Coarrays and UPC) are not supported on craype-network-ucx.
  • The debugging tools, including cray-stat and gdb4hpc, have received a number of important bugfixes and improvements in more recent releases of CPE. We will likely need to upgrade CPE to make these tools work properly.