- Information contained in the HLRS Wiki is not legally binding and HLRS is not responsible for any damages that might result from its use -
Mem256gb
large memory nodes
The 10 new large memory nodes are based on 4-socket AMD Opteron 6238 "Interlagos" CPUs in the 12-core, 2.6 GHz variant.
main hardware features
- 256 GB of DDR3-1600 memory, >100 GB/s memory bandwidth
- 48 cores total
remarks
- This 4-socket platform uses multi-chip module (MCM) CPUs: each CPU consists of 2 chips with 2 memory channels each, which results in 2 NUMA nodes per socket.
- With 8 NUMA nodes, this system is very sensitive to NUMA placement.
- With 48 cores and 256 GB, it is a good system for shared-memory applications, but task placement and NUMA placement are important; see NUMA Tuning for an introduction.
- Thread pinning helps to improve performance. The Intel OpenMP runtime option KMP_AFFINITY is not (yet?) supported on AMD CPUs, so use likwid-pin instead.
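The topology figures in the remarks above can be worked out explicitly. The following is a minimal sketch in shell arithmetic; the variable names are illustrative. It assumes that memory is spread evenly over the NUMA nodes and that each DDR3-1600 channel peaks at 1600 MT/s x 8 bytes = 12.8 GB/s (neither assumption is stated on this page).

```shell
# Figures taken from the hardware description above.
sockets=4
chips_per_socket=2      # MCM: 2 chips per CPU
channels_per_chip=2     # memory channels per chip
cores_total=48
mem_total_gb=256

# One NUMA node per chip.
numa_nodes=$((sockets * chips_per_socket))
cores_per_node=$((cores_total / numa_nodes))
# Assumption: all NUMA nodes are populated with the same amount of memory.
mem_per_node_gb=$((mem_total_gb / numa_nodes))
channels_total=$((numa_nodes * channels_per_chip))

# Assumption: DDR3-1600 delivers 1600 MT/s x 8 bytes per channel (theoretical peak).
peak_mb_s=$((channels_total * 1600 * 8))

echo "NUMA nodes: $numa_nodes, cores per node: $cores_per_node, GB per node: $mem_per_node_gb"
echo "memory channels: $channels_total, theoretical peak: $peak_mb_s MB/s"
```

Under these assumptions each NUMA node owns 6 cores and 32 GB, so a thread that works on memory attached to another node has to go through a remote chip, which is why placement matters on this system.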
Without pinning, the STREAM benchmark reaches only about 22-29 GB/s:
$ ./stream_avx
[...]
Function      Rate (MB/s)   Avg time   Min time   Max time
Copy:          22724.1176     0.0149     0.0141     0.0158
Scale:         29406.6272     0.0132     0.0109     0.0142
Add:           22833.1340     0.0236     0.0210     0.0260
Triad:         22749.7957     0.0241     0.0211     0.0270
Pinning the threads with likwid-pin can improve performance by a factor of 2-3 or more (about 4-5x in this STREAM run), assuming an OpenMP program compiled with the Intel compiler:
$ module load performance/likwid
$ likwid-pin -t intel stream_avx
[...]
Function      Rate (MB/s)   Avg time   Min time   Max time
Copy:          89282.0648     0.0040     0.0036     0.0041
Scale:        125390.2541     0.0026     0.0026     0.0026
Add:          102237.7575     0.0049     0.0047     0.0051
Triad:        120663.2257     0.0040     0.0040     0.0040
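To put a number on the effect of pinning, the rates from the two STREAM runs can be divided kernel by kernel. This is a small sketch; the helper name `speedup` is hypothetical, and the rates are copied from the outputs above.

```shell
# speedup BASE PINNED: print PINNED/BASE with one decimal place.
speedup() {
    LC_ALL=C awk -v base="$1" -v pinned="$2" 'BEGIN { printf "%.1f\n", pinned / base }'
}

# Rates in MB/s: first the run without pinning, then the likwid-pin run.
echo "Copy:  $(speedup 22724.1176 89282.0648)x"
echo "Scale: $(speedup 29406.6272 125390.2541)x"
echo "Add:   $(speedup 22833.1340 102237.7575)x"
echo "Triad: $(speedup 22749.7957 120663.2257)x"
```

For this benchmark the ratios come out at roughly 4 to 5; other applications will see different gains depending on how memory bound they are.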
This does not work:
$ KMP_AFFINITY=scatter ./stream_avx
[...]
OMP: Warning #72: KMP_AFFINITY: affinity only supported for Intel(R) processors.
OMP: Warning #71: KMP_AFFINITY: affinity not supported, using "disabled".
[...]
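A job script that runs on both Intel and AMD nodes could choose the pinning method from the CPU vendor. The following is only a sketch: `pin_command` is a hypothetical helper, the vendor is read from /proc/cpuinfo (which assumes Linux), and the two launch lines are the ones shown on this page. The helper only prints the command so that it can be inspected before use.

```shell
# pin_command VENDOR PROGRAM [ARGS...]: print a launch line with thread pinning.
pin_command() {
    vendor="$1"
    shift
    case "$vendor" in
        # KMP_AFFINITY is honoured only on Intel processors (see warning #72 above).
        GenuineIntel) echo "env KMP_AFFINITY=scatter $*" ;;
        # Everything else, including the AMD large memory nodes, goes through likwid-pin.
        *) echo "likwid-pin -t intel $*" ;;
    esac
}

# Vendor string of the first CPU; empty if /proc/cpuinfo is not available.
cpu_vendor=$(awk -F': ' '/^vendor_id/ { print $2; exit }' /proc/cpuinfo 2>/dev/null || true)

pin_command "$cpu_vendor" ./stream_avx
```

On the large memory nodes this prints the likwid-pin line; remember to load the performance/likwid module before running it.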