- Information contained in the HLRS Wiki is not legally binding and HLRS is not responsible for any damages that might result from its use -

Mem256gb

Revision as of 14:46, 6 August 2012 by Hwwnec5

large memory nodes

The new large memory nodes are 4-socket systems based on the AMD Opteron 6238 ("Interlagos"), the variant with 12 cores per socket at 2.6 GHz.

main hardware features

  • 256 GB of DDR3-1600 memory, more than 100 GB/s memory bandwidth
  • 48 cores in total (4 sockets × 12 cores)

remarks

  • This 4-socket platform uses multi-chip module (MCM) CPUs. Each CPU consists of 2 chips with 2 memory channels each, which results in 2 NUMA nodes per socket.
  • With 8 NUMA nodes in total, the system is very sensitive to NUMA placement.
  • With 48 cores and 256 GB of memory it is a good system for shared-memory applications, but task placement and NUMA placement are important. See NUMA Tuning for an introduction.
  • Thread pinning helps to improve performance. As the Intel OpenMP runtime option KMP_AFFINITY is not (yet?) supported on AMD CPUs, use likwid-pin instead.
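
The layout described in the remarks can be worked out with a few lines of shell arithmetic. The following is only a sketch: the hardware figures come from this page, but the even memory distribution and the core numbering (NUMA node k owning core IDs 6k to 6k+5) are assumptions, and scatter_cores is a hypothetical helper name. The real numbering should be checked on the node, for example with likwid-topology or numactl --hardware.

```shell
#!/bin/sh
# Hardware figures taken from the description on this page.
SOCKETS=4
CHIPS_PER_SOCKET=2      # each chip of the MCM forms one NUMA node
CORES_TOTAL=48
MEM_TOTAL_GB=256

NUMA_NODES=$((SOCKETS * CHIPS_PER_SOCKET))
CORES_PER_NODE=$((CORES_TOTAL / NUMA_NODES))
# ASSUMPTION: memory is populated evenly across all NUMA nodes.
MEM_PER_NODE_GB=$((MEM_TOTAL_GB / NUMA_NODES))

# scatter_cores N: print a comma-separated list of N core IDs,
# distributed round-robin over the NUMA nodes.
# ASSUMPTION: NUMA node k owns the consecutive core IDs
# k*CORES_PER_NODE .. k*CORES_PER_NODE + CORES_PER_NODE - 1.
scatter_cores() {
    n=$1
    [ "$n" -le "$CORES_TOTAL" ] || return 1   # no more threads than cores
    list=""
    i=0
    while [ "$i" -lt "$n" ]; do
        node=$((i % NUMA_NODES))              # which NUMA node gets thread i
        slot=$((i / NUMA_NODES))              # which core inside that node
        core=$((node * CORES_PER_NODE + slot))
        list="${list:+$list,}$core"
        i=$((i + 1))
    done
    echo "$list"
}

echo "NUMA nodes: $NUMA_NODES, cores per node: $CORES_PER_NODE, memory per node: ${MEM_PER_NODE_GB} GB"
echo "core list for 8 threads: $(scatter_cores 8)"
```

Under these assumptions, a list produced this way could be handed to likwid-pin through its -c option so that a run with fewer than 48 threads still uses the memory channels of all 8 NUMA nodes.
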

Without thread pinning, the STREAM benchmark stays far below the available memory bandwidth:

$ ./stream_avx
[...]
Function     Rate (MB/s)  Avg time   Min time  Max time
Copy:        22724.1176        0.0149        0.0141        0.0158
Scale:       29406.6272        0.0132        0.0109        0.0142
Add:         22833.1340        0.0236        0.0210        0.0260
Triad:       22749.7957        0.0241        0.0211        0.0270

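Pinning the threads with likwid-pin improves performance considerably, by a factor of about 4 to 5 in the STREAM runs shown here. The example assumes an OpenMP program compiled with the Intel compiler: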

$ module load performance/likwid
$ likwid-pin -t intel ./stream_avx
[...]
Function     Rate (MB/s)  Avg time   Min time  Max time
Copy:        89282.0648        0.0040        0.0036        0.0041
Scale:      125390.2541        0.0026        0.0026        0.0026
Add:        102237.7575        0.0049        0.0047        0.0051
Triad:      120663.2257        0.0040        0.0040        0.0040
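
The gain from pinning can be quantified by dividing the pinned rate by the unpinned rate for each STREAM kernel. A small sketch, with the rates copied from the two runs on this page; the helper name speedup is made up for illustration:

```shell
#!/bin/sh
# STREAM rates in MB/s from the two runs on this page:
# kernel name, rate without pinning, rate with likwid-pin.
RESULTS="Copy 22724.1176 89282.0648
Scale 29406.6272 125390.2541
Add 22833.1340 102237.7575
Triad 22749.7957 120663.2257"

# speedup NAME: print the pinned/unpinned ratio for one STREAM kernel.
speedup() {
    echo "$RESULTS" | awk -v k="$1" '$1 == k { printf "%.2f\n", $3 / $2 }'
}

for kernel in Copy Scale Add Triad; do
    echo "$kernel: $(speedup "$kernel")x"
done
```

For these particular runs the ratios come out between roughly 3.9 (Copy) and 5.3 (Triad). Other applications may gain more or less, depending on how memory-bound they are.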

Setting KMP_AFFINITY directly does not work on these AMD nodes, because the Intel OpenMP runtime disables affinity:

$ KMP_AFFINITY=scatter ./stream_avx
[...]
OMP: Warning #72: KMP_AFFINITY: affinity only supported for Intel(R) processors. 
OMP: Warning #71: KMP_AFFINITY: affinity not supported, using "disabled".
[...]
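
A job script that should run on both Intel and AMD nodes could select the pinning method from the CPU vendor. The following is a hedged sketch, not an HLRS-provided tool: cpu_vendor and choose_launcher are hypothetical helper names, the vendor is read from /proc/cpuinfo (Linux only), and the script only prints the command it would run.

```shell
#!/bin/sh
# cpu_vendor: print the vendor_id of the first CPU listed in
# /proc/cpuinfo, e.g. GenuineIntel or AuthenticAMD (empty if unavailable).
cpu_vendor() {
    awk -F': *' '/^vendor_id/ { print $2; exit }' /proc/cpuinfo 2>/dev/null
}

# choose_launcher VENDOR: print the command prefix used for pinning.
# On Intel CPUs the Intel OpenMP runtime honors KMP_AFFINITY; on anything
# else fall back to likwid-pin, as recommended on this page.
choose_launcher() {
    case "$1" in
        GenuineIntel) echo "env KMP_AFFINITY=scatter" ;;
        *)            echo "likwid-pin -t intel" ;;
    esac
}

# Print the resulting command instead of executing it, so the sketch
# can be tried on any machine.
echo "$(choose_launcher "$(cpu_vendor)") ./stream_avx"
```

On the large memory nodes described here the sketch would print the likwid-pin variant, provided the performance/likwid module has been loaded before the command is actually run.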