- Information in the HLRS Wiki is not legally binding and is provided without guarantee -

HPE Hawk

For the Hawk installation schedule, please see Hawk installation schedule.


If your job does not start, please keep in mind the time-dependent limitations described under Batch System!

This page is under construction!

The information below applies to the Test and Development System (TDS), which is similar to the future Hawk production system. Please keep in mind that this is a system under construction; modifications might occur without announcement, and things may not work as expected from time to time!


Hardware

Node/Processor

Compute nodes as well as login nodes are equipped with

 AMD EPYC 7702 64-Core Processor

Detailed information will be provided later. For additional information, please check AMD Rome 7702. With respect to node and processor details, cf. here.
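For a quick check of the processor actually present on a node, standard Linux tools can be used; the following is a minimal sketch (lscpu is a generic tool, not specific to this page):

 lscpu | grep -E 'Model name|Socket|Core|Thread'
 # Expected output on Hawk nodes includes:
 #   Model name: AMD EPYC 7702 64-Core Processor
 # (two sockets per node are assumed here, matching the dual-socket AMD
 #  configuration mentioned for the pre-/post-processing nodes below)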


Interconnect

Hawk deploys an InfiniBand HDR based interconnect with a 9-dimensional enhanced hypercube topology; please refer to here with respect to the latter. InfiniBand HDR has a bandwidth of 200 Gbit/s and an MPI latency of ~1.3 µs per link. The full bandwidth of 200 Gbit/s can be used when communicating between the 16 nodes connected to the same node of the hypercube (cf. here). Within the hypercube, the higher the dimension, the less bandwidth is available.

Topology-aware scheduling is used to exclude major performance fluctuations. This means that larger jobs can only be requested with defined node counts (64, 128, 256, 512, 1024, 2048 and 4096) in regular operation. This restriction ensures optimal system utilization while simultaneously exploiting the network topology. Jobs with fewer than 128 nodes are processed in a special partition, and jobs over 4096 nodes are processed at special times.
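As an illustration only, a request honoring the allowed node counts might look as follows; this is a PBS-style sketch, and the actual resource-selection syntax on Hawk may differ (cf. the Batch System section):

 # Request 256 nodes, one of the allowed sizes (64, 128, ..., 4096);
 # 'select' is assumed to count nodes and 'mpiprocs=128' is a placeholder
 qsub -l select=256:mpiprocs=128 -l walltime=01:00:00 job.sh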


Filesystem


Access

Login-Node: hawk-tds-login1.hww.hlrs.de

Note: Access to the Hawk TDS is now possible on request. In case you have early access, we ask you to provide us with your experience regarding usage and performance (approximately half a page) once a month.
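Login works via SSH; a minimal example using the hostname above (replace <username> with your HLRS account name):

 ssh <username>@hawk-tds-login1.hww.hlrs.de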



Module environment

cf. here
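The usual environment-modules commands apply; a short sketch (module names other than mpt/hmpt, which appear in the MPI section below, are examples only):

 module avail           # list modules available on the system
 module list            # show currently loaded modules
 module load mpt        # load a module, e.g. HPE MPT (cf. the MPI section)
 module unload mpt      # unload it again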


Pre- and post processing

Within the HLRS simulation environment, special nodes for pre- and post-processing tasks are available. Such nodes can be requested via the batch system (follow this link for more info; see the request sketch after the table below). Available nodes are:

   4 nodes   2 TB memory   2-socket AMD   x TB local storage   shared usage model
   1 node    4 TB memory   2-socket AMD   x TB local storage   shared usage model
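As a sketch, requesting one of the large-memory nodes might look like this; the queue name and resource labels are hypothetical placeholders, so please cf. the linked batch system documentation for the actual syntax:

 # Hypothetical request for one 2 TB pre-/post-processing node
 qsub -q <smp-queue> -l select=1:mem=2000gb -l walltime=04:00:00 prepost.sh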


More specialized nodes, e.g. graphics, vector and data analytics nodes, are available in the Vulcan cluster.

Compiler

cf. here


MPI

Tuned MPI: In order to use the MPI implementation provided by HPE, please load the Message Passing Toolkit (MPT) module mpt (not ABI-compatible with other MPI implementations) or hmpt (ABI-compatible with MPICH derivatives).

User Guide: For detailed information cf. the HPE Message Passing Interface (MPI) User Guide.

Performance optimization: With respect to MPI performance optimization by means of tuning environment variables, please cf. Tuning of MPT.
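A typical session with MPT might look as follows; the module names mpt/hmpt are taken from this page, while the wrapper and launcher names (mpicc, mpirun) follow the HPE MPI User Guide and are assumed to be available here:

 module load mpt            # HPE MPT; not ABI-compatible with other MPIs
 # alternatively: module load hmpt   (ABI-compatible with MPICH derivatives)
 mpicc -o my_app my_app.c   # compile against the loaded MPI
 mpirun -np 256 ./my_app    # launch with 256 MPI ranks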


Libraries

cf. here



Batch System

cf. here
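Pending the linked documentation, here is a minimal job script sketch assuming a PBS-style batch system; all directives and values are illustrative, not confirmed Hawk syntax:

 #!/bin/bash
 #PBS -N example_job
 #PBS -l select=64:mpiprocs=128   # assumed select syntax: 64 nodes, 128 ranks each
 #PBS -l walltime=00:30:00
 cd "$PBS_O_WORKDIR"              # PBS sets this to the submission directory
 module load mpt
 mpirun -np 8192 ./my_app         # 64 nodes x 128 ranks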



Disk storage

Home directories as well as workspaces are handled in the same way as on Hazel Hen, so please cf. the Storage Description for details.
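On Hazel Hen, workspaces are managed with the HLRS workspace tools; assuming the same mechanism applies here, typical commands are:

 ws_allocate my_ws 30   # allocate workspace 'my_ws' for 30 days
 ws_list                # list your workspaces and their expiry dates
 ws_release my_ws       # release the workspace when no longer needed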



Manuals

Processor:


MPI:


Batch system: