- Information contained in the HLRS Wiki is not legally binding and HLRS is not responsible for any damages that might result from its use -

HPE Hunter Hardware and Architecture

For some technical details of the Hunter hardware and architecture, please refer to these [https://kb.hlrs.de/platforms/upload/HLRS_Hunter_mit_Freigabe.pdf '''slides''']. <br>
== Summary ==
=== Node/Processor ===
==== Hunter APU compute nodes ====
* Blade: HPE Cray EX255a (El Capitan blade architecture, MI300A)
* APU: AMD Instinct MI300A Accelerator
* 4 APUs per node (a quick on-node check is sketched below the list)
* 24 CPU cores and 228 GPU compute units per APU
* Memory: 512 GB HBM3 per node
* HBM3 bandwidth: ~5.3 TB/s per APU
* Network: HPE Slingshot 11 (4 injection ports per node, 4x200 Gbps)
* Number of nodes: 188
* Some nodes (20) have one local NVMe M.2 SSD (~3.5 TB) installed.
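
A quick way to inspect the four APUs from a shell on a compute node is sketched below; this assumes the ROCm command-line tools (rocm-smi) are available in the default environment, which is not confirmed by this page:
 # Sketch: list the four MI300A devices of an APU node (assumes ROCm tools are installed)
 rocm-smi                     # per-APU summary: temperature, power, HBM usage
 rocm-smi --showproductname   # should report an AMD Instinct MI300A for each of the 4 devices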
 
 
==== Hunter CPU compute nodes ====
* CPU type: AMD EPYC 9374F 32-Core Processor
* 2 CPUs per node
* 32 cores per CPU (a quick on-node check is sketched below the list)
* Memory: 768 GB (DDR5-4800) per node
* Interconnect: InfiniBand HDR200 (1 injection port per node, 1x200 Gbps)
* Number of nodes: 256
* Some nodes (16) have one local NVMe M.2 SSD (~3.5 TB) installed.
<br>
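
As a minimal sketch, the socket and core layout of a CPU node can be verified with standard Linux tools (assuming lscpu and numactl are installed on the node):
 # Sketch: confirm 2 sockets x 32 cores and the NUMA layout of a CPU node
 lscpu | grep -E 'Socket\(s\)|Core\(s\) per socket|NUMA node\(s\)'
 numactl --hardware           # NUMA nodes with their local memory sizes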
 
==== Pre- and post-processing ====
Within the HLRS simulation environment, special nodes for pre- and post-processing tasks are available. These nodes can be requested via the [[Batch_System_PBSPro_(Hunter) | batch system]] using the queue "pre" or the queue "smp"; a minimal example job script is sketched below the node list.
Available nodes:
* 4 nodes: 3 TB memory, 2-socket AMD EPYC 9354 32-Core Processor, exclusive usage model, 24 TB localscratch, available via queue "pre"
* 1 node: 6 TB memory, 2-socket AMD EPYC 9354 32-Core Processor, shared usage model, 24 TB localscratch, available via queue "smp"
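
A minimal example job script for the "pre" queue is sketched below; the resource selection line and the tool name are only placeholders, please consult the [[Batch_System_PBSPro_(Hunter) | batch system]] page for the authoritative syntax:
 #!/bin/bash
 #PBS -N preprocessing
 #PBS -q pre                    # use "smp" for the shared 6 TB node
 #PBS -l select=1:ncpus=64      # placeholder resource request, adjust as documented
 #PBS -l walltime=04:00:00
  
 cd "$PBS_O_WORKDIR"
 ./prepare_case input.cfg       # placeholder for your pre-processing tool
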
More specialized nodes, e.g. for graphics, vector, or data analytics workloads, are available in the [[NEC_Cluster_Hardware_and_Architecture_(vulcan)|Vulcan cluster]].<BR>
If you need such specialized nodes on the Vulcan cluster for pre- or post-processing within a project located on Hunter resources, please ask your project manager for access to Vulcan.


=== Interconnect ===
HPE Slingshot 11 with a Dragonfly topology


=== Filesystem ===
Available Lustre filesystems on Hunter:


* ws12 (HPE Cray ClusterStor E1000 Lustre Appliance):
** available storage capacity: ~12 PB
** Lustre devices: 2 MDT, 20 OST
** performance:
{{Note| text=
In a further installation phase this year, an HPE Cray ClusterStor E2000 Lustre appliance with a capacity of 13 PB will additionally be installed. In a third installation phase, the current HPE Cray ClusterStor E1000 will be upgraded to an E2000.
}}


Additionally, a central HOME and project fileserver is mounted on Hunter.
Some special nodes have a local disk installed which can be used as localscratch.
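
A hypothetical sketch of using such a local disk inside a batch job is shown below; the mount point /localscratch and the tool names are only assumptions and must be verified, e.g. on the [[Storage_(Hunter)| Storage (Hunter)]] page:
 # Hypothetical sketch: stage data to the node-local NVMe SSD and copy results back
 SCRATCH=/localscratch/$PBS_JOBID             # assumed path, verify on the system
 mkdir -p "$SCRATCH"
 cp input.dat "$SCRATCH"/                     # stage input onto the local SSD
 ( cd "$SCRATCH" && ./solver input.dat )      # placeholder solver run with fast local I/O
 cp "$SCRATCH"/result.dat "$PBS_O_WORKDIR"/   # copy results back before the job ends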


See also [[Storage_(Hunter)| Storage (Hunter)]]
