- Information contained in the HLRS Wiki is not legally binding and HLRS is not responsible for any damages that might result from its use -
CPE: Difference between revisions
From HLRS Platforms
Revision as of 08:26, 21 July 2023
Cray Programming Environment User Guide: CSM for HPE Cray EX Systems HPE CPE User Guide
CPE on Hawk
Setup
Archive cpe_on_hawk.tar contains
$ tar -tf cpe_on_hawk.tar
enable_cpe.sh
compiler/aocc/lmod/modulefiles/core/
(...)
compiler/intel/lmod/modulefiles/core/
(...)
test/placement/job.sh
- Bash script to initialize CPE: enable_cpe.sh
- Additional module files for PrgEnv-aocc and PrgEnv-intel: compiler/
- Example PBS job script (for xthi): test/placement/job.sh
To use PrgEnv-aocc and PrgEnv-intel, place the compiler directory in a directory DIR of your choice and set REPO_ROOT=DIR in enable_cpe.sh.
Loading CPE
$ module list
Currently Loaded Modules:
  1) system/site_names   3) system/wrappers/1.0          5) gcc/10.2.0
  2) system/ws/1.4.0     4) hlrs-software-stack/current  6) mpt/2.26
$ . enable_cpe.sh
$ module list
Currently Loaded Modules:
  1) system/site_names     6) perftools-base/22.09.0  11) cray-libsci/22.11.1.2
  2) system/ws/1.4.0       7) cce/15.0.0              12) cray-pals/1.2.4
  3) system/wrappers/1.0   8) craype/2.7.19           13) PrgEnv-cray/8.3.4
  4) craype-x86-rome       9) cray-dsmml/0.2.2
  5) craype-network-ucx   10) cray-mpich-ucx/8.1.21
$ cc --version
Cray clang version 15.0.0 (324a8e7de6a18594c06a0ee5d8c0eda2109c6ac6)
Target: x86_64-unknown-linux-gnu
Thread model: posix
InstalledDir: /opt/cray/pe/cce/15.0.0/cce-clang/x86_64/share/../bin
Unloading CPE
$ module restore
Resetting modules to system default
(...)
$ echo $MODULEPATH | tr : '\n'
/opt/hlrs/non-spack/rev-009_2022-09-01/modulefiles/mpt/2.26/gcc/10.2.0
/sw/hawk-rh8/hlrs/spack/rev-009_2022-09-01/modulefiles/linux-rocky8-x86_64/mpt/2.26-zbg2lgp/gcc/10.2.0
/opt/hlrs/spack/rev-009_2022-09-01/modulefiles/linux-rocky8-x86_64/gcc/10.2.0
/opt/hlrs/non-spack/rev-009_2022-09-01/modulefiles/gcc/10.2.0
/opt/hlrs/spack/rev-009_2022-09-01/modulefiles/linux-rocky8-x86_64/gcc/8.5.0
/opt/hlrs/non-spack/rev-009_2022-09-01/modulefiles/gcc/8.5.0
/opt/hlrs/non-spack/system/modulefiles
$ echo $LD_LIBRARY_PATH | tr : '\n'
/opt/hlrs/non-spack/rev-009_2022-09-01/compiler/gcc/10.2.0/mpt_custom_fortran_modules/2.26
/opt/hlrs/non-spack/mpi/mpt/2.26/lib
/opt/hlrs/non-spack/rev-009_2022-09-01/compiler/gcc/10.2.0/lib64
/opt/cray/pe/lib64   # <--- Need a proper modulefile to undo this change
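The listing above shows that /opt/cray/pe/lib64 stays in LD_LIBRARY_PATH after module restore. Until a modulefile handles this, one possible manual cleanup is sketched below. The function name remove_path_entry is hypothetical, and this workaround is an assumption on our part, not an official procedure.

```shell
#!/bin/bash
# Hypothetical helper: print a colon-separated path list with every
# occurrence of one entry removed.
#   $1 = path list (e.g. the value of LD_LIBRARY_PATH)
#   $2 = entry to drop
remove_path_entry() {
    local list="$1" drop="$2" entry result=""
    local IFS=':'
    for entry in $list; do
        # Skip the entry we want to get rid of, keep everything else in order.
        [ "$entry" = "$drop" ] && continue
        result="${result:+$result:}$entry"
    done
    printf '%s\n' "$result"
}

# After `module restore`, drop the directory that CPE left behind.
LD_LIBRARY_PATH="$(remove_path_entry "${LD_LIBRARY_PATH:-}" /opt/cray/pe/lib64)"
export LD_LIBRARY_PATH
```

Running echo $LD_LIBRARY_PATH | tr : '\n' again afterwards should no longer list /opt/cray/pe/lib64.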
General Notes
- Use the compiler drivers cc, CC, and ftn to compile and link C, C++, and Fortran code. If you need to debug a build or want to see which tools are invoked with which flags and options, compile with -v or --cray-print-opts=all.
- Cray compilers can generate detailed yet readable optimization reports. Try -fsave-loopmark (C/C++) or -rm (Fortran) and inspect the resulting listing files (*.lst).
- The Cray Fortran compiler is a highly optimizing compiler that performs extensive analyses at -O2 or higher. If you value short compile times during development, consider dialing down the optimization level to -O1 or -O0, or see if -Oipa1 or -Oipa0 makes a difference. Be aware that the default optimization level is -O2 (and -Oipa3), not -O0.
- To look up compiler messages, warnings, errors, and runtime errors, use explain. For example, explain ftn-2103 gives more information about Fortran warning 2103.
- man intro_directives gives an overview of the pragmas and directives supported by the Cray compilers, should you look for specific extensions.
- Prefer mpiexec to mpirun in PBS batch scripts, unless you make sure that aliases are expanded, for example by using shopt -s expand_aliases.
- man mpiexec contains a few examples of binding processes and threads to CPUs. Use --cpu-bind depth -d $OMP_NUM_THREADS for hybrid programs. In general, we recommend experimenting with xthi to verify that you get the binding you want.
- The Parallel Application Launch Service (PALS) is not configured to allow launching applications on login nodes. You can run mpirun or mpiexec only on compute nodes.
- The environment variable PE_ENV can be used to determine which PrgEnv-<env> is active. If something requires Cray compilers, for example, you can test if [ "$PE_ENV" = "CRAY" ].
- In some cases, it is necessary to prepend CRAY_LD_LIBRARY_PATH to LD_LIBRARY_PATH, in particular when you want to use a non-default version of a dynamic library that is not present in /opt/cray/pe/lib64. If ldd <exe> reports missing dependencies, check if LD_LIBRARY_PATH=$CRAY_LD_LIBRARY_PATH:$LD_LIBRARY_PATH ldd <exe> resolves them.
- When running pat_report on an empty experiment data directory, include -i <exe> to specify where your instrumented executable <exe> can be found. This is more robust than relying on a recorded path, which may no longer exist by the time you run pat_report. You won't need this option if you launch your instrumented program with mpiexec --no-transfer.
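Two of the notes above, the PE_ENV test and the binding options for hybrid programs, can be combined in a job script roughly as follows. This is a minimal sketch: the function names require_pe_env and launch_hybrid and the DRY_RUN switch are hypothetical, and the use of -n to set the number of ranks is an assumption; check man mpiexec for the options available on your system.

```shell
#!/bin/bash
# Sketch of helpers for a PBS job script; all function names are hypothetical.

# Succeed only if the requested programming environment is active.
# PE_ENV is set by the PrgEnv-<env> modules, e.g. CRAY for PrgEnv-cray.
require_pe_env() {
    if [ "${PE_ENV:-}" != "$1" ]; then
        echo "error: PE_ENV is '${PE_ENV:-unset}', expected '$1'" >&2
        return 1
    fi
}

# Launch a hybrid MPI+OpenMP program with one "depth" of CPUs per rank.
#   $1 = number of ranks (passed to -n, an assumption)
#   $2 = OpenMP threads per rank
#   remaining arguments = program and its arguments
# With DRY_RUN set, the command line is printed instead of executed.
launch_hybrid() {
    local nranks="$1" nthreads="$2"
    shift 2
    set -- mpiexec -n "$nranks" --cpu-bind depth -d "$nthreads" "$@"
    if [ -n "${DRY_RUN:-}" ]; then
        echo "$*"
    else
        OMP_NUM_THREADS="$nthreads" "$@"
    fi
}
```

In a job script, require_pe_env CRAY || exit 1 followed by launch_hybrid 8 16 ./xthi would start 8 ranks with 16 threads each. Setting DRY_RUN=1 first lets you inspect the mpiexec line on a login node, where launching is not possible.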
Known Issues and Solutions/Workarounds
- Using PrgEnv-gnu and gcc/12.1.0, you might see ld complain about being "unable to initialize decompress status for section .debug_info" during linking. This warning goes away after loading an older version of gcc, for instance, gcc/10.3.0. Likewise, if you have trouble generating debugging information (-g), consider switching to an older version of gcc.
- Using PrgEnv-intel and intel/2022.0.2 or intel-oneapi/2022.0.2, you will see ftn (ifort or ifx) warn about "overriding '-march=core-avx2' with '-march=core-avx2'". The reason is that -march=core-avx2 is indeed passed twice, by the installation's ifort.cfg or ifx.cfg and by CPE's x86-rome CPU target. You can ignore this warning.
- If you try to populate an empty experiment data directory using pat_report, but pat_report can't open the instrumented executable and refers to a file in /var/run/palsd/... that no longer exists, invoke pat_report with -i <exe> as suggested above, or use mpiexec --no-transfer to run your experiment.
- Some runtime libraries, such as libcrayacc, depend on libcuda, which is not available everywhere. If you want to cross-compile OpenACC or OpenMP GPU code on a login node, invoke the linker with --allow-shlib-undefined (-Wl,--allow-shlib-undefined) to allow undefined symbols in shared objects. The resulting executable will run on an AI node.
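For builds driven by make or a similar tool, the linker option from the last item can be passed through LDFLAGS. The sketch below assumes that your build system honors LDFLAGS at link time; the helper name add_ldflag is hypothetical.

```shell
#!/bin/bash
# Hypothetical helper: append a flag to LDFLAGS unless it is already there,
# so that sourcing this snippet repeatedly does not duplicate the flag.
add_ldflag() {
    case " ${LDFLAGS:-} " in
        *" $1 "*) ;;                               # already present
        *) LDFLAGS="${LDFLAGS:+$LDFLAGS }$1" ;;    # append, space-separated
    esac
    export LDFLAGS
}

# Allow undefined symbols in shared objects (e.g. libcuda symbols that
# cannot be resolved on a login node) when linking through cc, CC, or ftn.
add_ldflag -Wl,--allow-shlib-undefined
```

When you call the compiler driver directly, add -Wl,--allow-shlib-undefined to the link line instead.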
Known Issues and Current Limitations
- The current version of CrayPat (perftools and perftools-lite) has a bug in combination with PBS Pro and PALS that can cause different node names to map to the same node ID. This affects not only the summary printed by pat_report ("Numbers of PEs per Node"), but also rank order suggestions ("No rank order was suggested because all ranks are on one node.").
- The PGAS programming models (Coarrays and UPC) are not supported on craype-network-ucx.
- The debugging tools, including cray-stat and gdb4hpc, have received a number of important bugfixes and improvements in more recent releases of CPE. We will likely need to upgrade CPE to make these tools work properly.