- Information contained in the HLRS Wiki is not legally binding and HLRS is not responsible for any damages that might result from its use -
Libraries(Hawk)
Latest revision as of 16:19, 26 April 2023
AOCL
The AMD Optimizing CPU Libraries (AOCL, https://developer.amd.com/amd-aocl/) are a set of libraries tuned for AMD EPYC processors.
A module is available for gcc and aocc with
module load aocl/2.2.0
Keep in mind this module only contains the proprietary non-MPI components like amd-libm.
The libraries utilizing MPI will be compiled separately with MPT and provided as their own modules at a later time.
Library | Purpose | Source | License | URL |
---|---|---|---|---|
amd-blis/amd-blis-mt | BLIS is a portable software framework for instantiating high-performance BLAS-like dense linear algebra libraries. | https://github.com/amd/blis | 3-clause BSD | https://developer.amd.com/amd-aocl/blas-library/ |
amd-fftw | An AMD optimized FFTW that includes selective kernels and routines optimized for the AMD EPYC™ processor family. | https://github.com/amd/amd-fftw | GPLv2 | https://developer.amd.com/amd-aocl/fftw/ |
amd-libflame | libFLAME is a portable library for dense matrix computations, providing much of the functionality present in LAPACK. | https://github.com/amd/libflame | 3-clause BSD | http://developer.amd.com/amd-cpu-libraries/blas-library/#libflame |
amd-libm | amd-libm implements optimized trigonometric, logarithmic/exponential, power, etc. functions that should perform better than the system libm. Provides vector intrinsics of these as well (currently for __m128 __m128d types). | proprietary | AMD LibM EULA | http://developer.amd.com/amd-cpu-libraries/amd-math-library-libm/ |
amd-rng | AMD Random Number Generator Library is a pseudorandom number generator library. It provides a comprehensive set of statistical distribution functions and various uniform distribution generators (base generators) including Wichmann-Hill and Mersenne Twister. | proprietary | AMD RNG EULA | http://developer.amd.com/amd-cpu-libraries/rng-library/ |
amd-securerng | The AMD Secure Random Number Generator (RNG) is a library that provides APIs to access the cryptographically secure random numbers generated by AMD’s hardware-based random number generator implementation. | proprietary | AMD SecureRNG EULA | http://developer.amd.com/amd-cpu-libraries/rng-library/#securerng |
Intel MKL
Intel MKL is available with
module load mkl/<version>
To find the link flags for a given compiler, use the Intel MKL Link Line Advisor.
Static linking and the 32-bit integer interface (LP64, see https://software.intel.com/en-us/mkl-linux-developer-guide-using-the-ilp64-interface-vs-lp64-interface) are recommended if applicable.
For distributed MKL functionality (ScaLAPACK, etc.), select SGIMPT as the MPI library if you use mpt.
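As an illustration of the recommendations above, the following sketch assembles the kind of link line the Link Line Advisor typically gives for GCC with static linking, the LP64 interface and sequential threading, plus a ScaLAPACK variant using the SGI MPT BLACS layer. It assumes that the mkl module exports MKLROOT (the fallback path is only a placeholder), that mpicc is the MPI compiler wrapper, and that my_solver.c is a hypothetical source file. Check the result against the advisor for your MKL version before relying on it.

```shell
# Assumption: the mkl module exports MKLROOT; the fallback is only a placeholder.
MKLROOT="${MKLROOT:-/opt/intel/mkl}"
MKL_LIB="${MKLROOT}/lib/intel64"

# Compile flags: 64-bit target plus the MKL headers.
MKL_CFLAGS="-m64 -I${MKLROOT}/include"

# Static, LP64, sequential MKL. The linker group resolves the circular
# dependencies between the three archives.
MKL_LIBS="-Wl,--start-group ${MKL_LIB}/libmkl_intel_lp64.a ${MKL_LIB}/libmkl_sequential.a ${MKL_LIB}/libmkl_core.a -Wl,--end-group -lpthread -lm -ldl"

# Distributed variant for mpt: ScaLAPACK plus the SGI MPT BLACS archive.
MKL_LIBS_MPT="${MKL_LIB}/libmkl_scalapack_lp64.a -Wl,--start-group ${MKL_LIB}/libmkl_intel_lp64.a ${MKL_LIB}/libmkl_sequential.a ${MKL_LIB}/libmkl_core.a ${MKL_LIB}/libmkl_blacs_sgimpt_lp64.a -Wl,--end-group -lpthread -lm -ldl"

# Print the resulting command lines instead of running them.
echo "gcc ${MKL_CFLAGS} my_solver.c ${MKL_LIBS} -o my_solver"
echo "mpicc ${MKL_CFLAGS} my_solver.c ${MKL_LIBS_MPT} -o my_solver"
```

The commands are only printed so that they can be inspected first; paste them into a build script once the paths have been confirmed.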
Intel MKL appears to contain a 'Cripple AMD CPU' mechanism (https://www.extremetech.com/computing/302650-how-to-bypass-matlab-cripple-amd-ryzen-threadripper-cpus) that suppresses the AVX/AVX2/FMA code paths on AMD CPUs even though these CPUs support them.
This behavior can be bypassed with an undocumented environment variable set at runtime:
export MKL_DEBUG_CPU_TYPE=5
./a.out
In tests with Intel's MKL sample codes, setting this environment variable led to 2-3x better performance on a Zen 2 AMD CPU.
This environment variable is currently set as default for all users.
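Because the variable is set by default, comparing both code paths means removing it explicitly for one run. The sketch below shows one way to do that. The helper name run_mkl is hypothetical, ./a.out stands for any MKL-linked executable, and `env -u` assumes GNU coreutils.

```shell
# Run a command with the bypass explicitly enabled or disabled.
# run_mkl is a hypothetical helper, not part of MKL or the module system.
run_mkl() {
    mode=$1
    shift
    case "$mode" in
        bypass)  env MKL_DEBUG_CPU_TYPE=5 "$@" ;;       # force the bypass for this command
        default) env -u MKL_DEBUG_CPU_TYPE "$@" ;;      # remove the variable for this command
        *) echo "usage: run_mkl bypass|default command..." >&2; return 2 ;;
    esac
}

# Show what the current shell would pass on to a program.
echo "MKL_DEBUG_CPU_TYPE=${MKL_DEBUG_CPU_TYPE:-<unset>}"

# Compare both settings if an executable named ./a.out exists.
if [ -x ./a.out ]; then
    time run_mkl default ./a.out
    time run_mkl bypass ./a.out
fi
```

Setting the variable per command keeps the comparison independent of whatever the login environment exports.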
Attention: With mkl/19.1.1 this bypass (which worked with mkl/19.1.0) appears to have been removed, so that version is not made available. Intel has replaced the classic legacy compiler with the LLVM-based oneAPI compiler (https://www.intel.com/content/www/us/en/developer/articles/technical/adoption-of-llvm-complete-icx.html), which should also support non-Intel CPUs, and starting with MKL release 2020.2 there seem to be specific functions for AMD Zen architectures. However, the best performance still seems to be achieved by adding a function int mkl_serv_intel_cpu_true(){return 1;} that overrides MKL's own check and makes it treat the CPU as an Intel CPU, which is just another hack.
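One commonly described way to add such a function without touching the application source is to build it into a small shared object and preload it. This is a sketch and not a method this page prescribes: the file names fakeintel.c and libfakeintel.so are hypothetical, gcc is assumed as the compiler, and preloading only takes effect when MKL is linked dynamically.

```shell
# Write the one-line override into a C file (file names are hypothetical).
cat > fakeintel.c <<'EOF'
/* Overrides MKL's vendor check so that it reports an Intel CPU. */
int mkl_serv_intel_cpu_true(void) { return 1; }
EOF

# Build the override as a shared object if a C compiler is available.
if command -v gcc >/dev/null 2>&1; then
    gcc -shared -fPIC -o libfakeintel.so fakeintel.c
fi

# Preload it for a single run of a dynamically linked executable.
if [ -x ./a.out ] && [ -f libfakeintel.so ]; then
    LD_PRELOAD="$PWD/libfakeintel.so" ./a.out
fi
```

Setting LD_PRELOAD for a single command keeps the override from leaking into unrelated programs started from the same shell.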
Further libraries
For an up-to-date list of the available libraries, run
module avail
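The full list can be long, so filtering it for a keyword may be handy. The sketch below wraps this in a hypothetical helper called find_module. It merges stderr into stdout because `module avail` commonly prints its listing to stderr, and it only runs the example if the module command is defined in the current shell.

```shell
# Filter the module list for a keyword (find_module is a hypothetical helper).
# stderr is merged into stdout because `module avail` often prints there.
find_module() {
    module avail 2>&1 | grep -i -- "$1"
}

# Example: list modules whose names mention fftw.
if command -v module >/dev/null 2>&1; then
    find_module fftw || echo "no module matching 'fftw' found"
fi
```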