- Information contained in the HLRS Wiki is not legally binding and HLRS is not responsible for any damages that might result from its use -

NEC Cluster Hardware and Architecture (vulcan)

=== Hardware ===


 
The list of currently available hardware can be found [https://kb.hlrs.de/platforms/index.php/Batch_System_PBSPro_(vulcan)#Node_types here].
* ''' Pre- & Postprocessing node''' (''smp'' node)
** 8x Intel [http://ark.intel.com/products/46497/Intel-Xeon-Processor-X7542-(18M-Cache-2_66-GHz-5_86-GTs-Intel-QPI) Xeon X7542], 48 cores total @ 2.67GHz
** 1TB memory
** shared access
 
* '''CascadeLake 40-core compute nodes''' (''clx'')
** 96 nodes (''clx-25'', ''clx384gb40c'')
*** 2x Intel [https://ark.intel.com/content/www/us/en/ark/products/192446/intel-xeon-gold-6248-processor-27-5m-cache-2-50-ghz.html Xeon Gold 6248], 40 cores total @ 2.50GHz
*** 384GB memory
** 8 nodes (''clx-21'', ''clx384gb40c-ai'')
*** 2x Intel [https://ark.intel.com/content/www/us/en/ark/products/192437/intel-xeon-gold-6230-processor-27-5m-cache-2-10-ghz.html Xeon Gold 6230], 40 cores total @ 2.10GHz
*** 384GB memory
*** 1.8TB NVMe mounted at /localscratch
 
* '''CascadeLake 36-core compute nodes''' (''clx-ai'') for artificial intelligence and big data applications
** 8 nodes (''clx768gb36c-ai'')
*** 2x Intel [https://ark.intel.com/content/www/us/en/ark/products/192443/intel-xeon-gold-6240-processor-24-75m-cache-2-60-ghz.html Xeon Gold 6240], 36 cores total @ 2.60GHz
*** 768GB memory
*** 8x Nvidia Tesla V100 SXM2 32GB
*** 7.3TB NVMe mounted at /localscratch
*** 220GB SSD mounted at /tmp
 
* '''Haswell 20-core compute nodes''' (''hsw'')
** 2x Intel [http://ark.intel.com/de/products/81706/Intel-Xeon-Processor-E5-2660-v3-25M-Cache-2_60-GHz Xeon E5-2660v3], 20 cores total @ 2.60GHz
** 84 nodes (''hsw128gb20c'')
*** 128GB RAM
** 4 nodes (''hsw256gb20c'')
*** 256GB RAM
 
* '''Haswell 24-core compute nodes''' (''hsw'')
** 2x Intel [https://ark.intel.com/content/www/us/en/ark/products/81908/intel-xeon-processor-e5-2680-v3-30m-cache-2-50-ghz.html Xeon E5-2680v3], 24 cores total @ 2.50GHz
** 152 nodes (''hsw128gb24c'')
*** 128GB memory
** 16 nodes (''hsw256gb24c'')
*** 256GB memory
 
* '''Skylake 40-core compute nodes''' (''skl'')
** 100 nodes (''skl192gb40c'')
*** 2x Intel [https://www.intel.com/content/www/us/en/products/processors/xeon/scalable/gold-processors/gold-6138.html Xeon Gold 6138], 40 cores total @ 2.00GHz
*** 192GB memory

* '''Visualisation node with AMD graphics''' (''visamd'')
** 6 nodes
*** 2x Intel [https://ark.intel.com/content/www/us/en/ark/products/123551/intel-xeon-silver-4112-processor-8-25m-cache-2-60-ghz.html Xeon Silver 4112], 8 cores total @ 2.60GHz
*** 96GB memory
*** AMD Radeon Pro WX8200
 
* '''Visualisation node with NVIDIA graphics''' (''visnv'')
** 1 node
*** 2x Intel [https://ark.intel.com/content/www/us/en/ark/products/123551/intel-xeon-silver-4112-processor-8-25m-cache-2-60-ghz.html Xeon Silver 4112], 8 cores total @ 2.60GHz
*** 96GB memory
*** Nvidia Quadro RTX 4000
 
* '''Visualisation/GPGPU graphics nodes''' (''visp100'')
** 10 nodes
*** 2x Intel [https://ark.intel.com/content/www/us/en/ark/products/92979/intel-xeon-processor-e5-2667-v4-25m-cache-3-20-ghz.html Xeon E5-2667v4], 16 cores total @ 3.20GHz
*** 256GB memory
*** Nvidia Tesla P100 12GB
*** 3.7TB SSD mounted at /localscratch
*** 400GB SSD mounted at /tmp
 
 
* '''Interconnect''': [http://de.wikipedia.org/wiki/Infiniband InfiniBand]
** Various generations of InfiniBand switches with QDR, FDR, EDR and HDR speeds
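
Several node types above provide fast node-local scratch storage mounted at <code>/localscratch</code> in addition to the global Lustre file systems. A minimal sketch of the usual staging pattern inside a batch job is shown below; the per-job subdirectory, the input/output paths and the application name are assumptions for illustration, not documented values.

<pre>
#!/bin/bash
# Sketch: stage data to node-local NVMe scratch, run, copy results back.
# /localscratch is taken from the node descriptions above; everything
# else (directory layout, application name) is a placeholder.

SCRATCH=/localscratch/$USER/$PBS_JOBID   # assumed per-job directory
mkdir -p "$SCRATCH"

# Stage input from the global file system to the fast local disk
cp -r "$HOME/project/input" "$SCRATCH/"

cd "$SCRATCH"
./my_app input                           # placeholder application

# Copy results back before the job ends: /localscratch is node-local
# and is typically cleaned up after the job finishes
cp -r results "$HOME/project/"
rm -rf "$SCRATCH"
</pre>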


=== Architecture ===


The NEC Cluster platform (vulcan) consists of several '''frontend nodes''' for interactive access (for access details see [[NEC_Cluster_access_(vulcan)| Access]]) and several compute nodes of different types for the execution of parallel programs. Some of the compute nodes come from the old NEC Cluster laki.


'''Compute node types installed:'''
* Intel Xeon Broadwell, Skylake, CascadeLake
* AMD Epyc Rome, Genoa
* different memory sizes (256GB, 384GB, 512GB, 768GB)
* Pre-/Postprocessing node with very large memory (1.5TB, 3TB)
* Visualisation/GPU nodes with AMD Radeon Pro WX8200, Nvidia Quadro RTX4000 or Nvidia A30
* Vector nodes with NEC Aurora TSUBASA CPUs

'''Features'''
* Operating System: Rocky Linux 8
* Batch system: PBSPro
* node-node interconnect: InfiniBand + 10G Ethernet
* Global Disk 2.2 PB (Lustre) for vulcan + 500TB (Lustre) for vulcan2
* Many Software Packages for Development
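
A specific node type is requested through the PBSPro batch system. The sketch below shows what a minimal job script might look like, assuming the ''node_type'' select resource described on the linked batch-system page; the job name, module name and application are placeholders.

<pre>
#!/bin/bash
#PBS -N example_job
#PBS -l select=1:node_type=clx-25:mpiprocs=40
#PBS -l walltime=00:20:00

# Run from the directory the job was submitted from
cd "$PBS_O_WORKDIR"

# Placeholder: load whatever MPI environment the installation provides
module load mpi

# 40 MPI ranks, matching the 40 cores of a clx-25 node
mpirun -np 40 ./my_app
</pre>
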
=== History ===
{{Warning
| text = Hardware Upgrade took place on 2024-05-24<br>
Some of the compute nodes and the network infrastructure of vulcan have been replaced by up-to-date hardware.
}}
{| class="wikitable" border="1" cellpadding="2"
|+'''Replacement Overview:'''
|-
|'''node_type'''||'''node count before upgrade'''||'''node count after upgrade'''
|-
|''aurora''|| 8 || 8
|-
|''clx-21''|| 8 || 8
|-
|''clx-25''        || 96 ||        96
|-
|<font color=red>''clx-ai''</font>        ||  4 ||          <font color=red>0</font>
|-
|<font color=red>''hsw128gb20c''</font>  || 84 ||          <font color=red>0</font>
|-
|<font color=red>''hsw128gb24c''</font>  || 152 ||          <font color=red>0</font>
|-
|<font color=red>''hsw256gb20c''</font>  || 4 ||          <font color=red>0</font>
|-
|<font color=red>''hsw256gb24c''</font>  || 16 ||          <font color=red>0</font>
|-
|<font color=red>''k20xm''</font>        ||  3 ||          <font color=red>0</font>
|-
|''p100''          ||  3 ||          3
|-
|''skl''          || 68 ||        72
|-
|''smp''          ||  2 ||          1
|-
|''visamd''        ||  6 ||          6
|-
|''visnv''        ||  2 ||          2
|-
|<font color=red>''visp100''</font>      || 10 ||          <font color=red>0</font>
|-
|''rome256gb32c''  ||  3 ||          3 <sup>(1)(2)</sup>
|-
|''rome512gb96c-ai'' || 10 ||        10 <sup>(1)(3)</sup>
|-
|<font color=green>''genoa''</font>          || 0 ||        <font color=green>60</font> <sup>(4)(5)</sup>
|-
|<font color=green>''genoa-a30''</font>      || 0 ||        <font color=green>24</font> <sup>(4)(6)</sup>
|-
|<font color=green>''genoa-smp''</font>      || 0 ||          <font color=green>2</font> <sup>(4)(7)</sup>
|-
|}
<sup>
(1) academic usage only<br>
(2) 2x AMD Epyc 7302 Rome, 3.0GHz base, 32 cores total, 256GB DDR4, 3.5TB NVMe<br>
(3) 2x AMD Epyc 7642 Rome, 2.3GHz base, 96 cores total, 512GB DDR4, 1.8TB NVMe, 8x AMD Instinct MI50 with 32GB<br>
(4) new nodes, node_type not yet fixed<br>
(5) 2x AMD Epyc 9334 Genoa, 2.7GHz base, 64 cores total, 768GB DDR5<br>
(6) 2x AMD Epyc 9124 Genoa, 3.0GHz base, 32 cores total, 768GB DDR5, 3.8TB NVMe, 1x Nvidia A30 with 24GB HBM2e<br>
(7) 2x AMD Epyc 9334 Genoa, 2.7GHz base, 64 cores total, 3072GB DDR5<br>
</sup>
