
HPE Hawk Hardware and Architecture


Revision as of 14:57, 5 March 2020

Node/Processor

For details of the processor deployed in Hawk, please refer to these slides.

Pre- and post-processing

Within the HLRS simulation environment, special nodes for pre- and post-processing tasks are available. Such nodes can be requested via the batch system (follow this link for more info). The available nodes are:

    Nodes    Memory    Processor       Local storage    Usage model
    4        2 TB      2-socket AMD    x TB             shared
    1        4 TB      2-socket AMD    x TB             shared


More specialized nodes (e.g. graphics, vector, and data analytics nodes) are available in the Vulcan cluster.


Interconnect

Hawk deploys an InfiniBand HDR based interconnect with a 9-dimensional enhanced hypercube topology; please refer to [https://kb.hlrs.de/platforms/upload/Interconnect_topology.pdf these slides] for details on the topology. InfiniBand HDR has a bandwidth of 200 Gbit/s and an MPI latency of ~1.3 µs per link. The full bandwidth of 200 Gbit/s can be used when communicating between the 16 nodes connected to the same node of the hypercube (cf. [https://kb.hlrs.de/platforms/upload/Interconnect_topology.pdf these slides]). Within the hypercube, the higher the dimension, the less bandwidth is available.

Topology-aware scheduling is used to avoid major performance fluctuations. This means that, in regular operation, larger jobs can only be requested with defined node counts (64, 128, 256, 512, 1024, 2048 and 4096). This restriction ensures optimal system utilization while exploiting the network topology. Jobs with fewer than 128 nodes are processed in a special partition; jobs with more than 4096 nodes are processed at special times.

For further details, please refer to [https://kb.hlrs.de/platforms/upload/Interconnect.pdf these slides].
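The latency and bandwidth figures quoted above can be sanity-checked with a standard MPI ping-pong microbenchmark between two ranks placed on different nodes. The sketch below is only an illustration, not an official HLRS benchmark: the message sizes (8 bytes and 8 MiB) and the repetition count are arbitrary choices, and the measured values will depend on where the two ranks land in the enhanced hypercube.

<syntaxhighlight lang="c">
/* Minimal MPI ping-pong sketch to estimate point-to-point latency and
 * bandwidth between two ranks. Illustration only, not an official HLRS
 * benchmark; message sizes and repetition count are arbitrary choices. */
#include <mpi.h>
#include <stdio.h>
#include <stdlib.h>

int main(int argc, char **argv)
{
    MPI_Init(&argc, &argv);

    int rank, size;
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &size);
    if (size < 2) {
        if (rank == 0) fprintf(stderr, "Run with at least 2 ranks.\n");
        MPI_Abort(MPI_COMM_WORLD, 1);
    }

    const int reps = 1000;
    const size_t sizes[2] = { 8,                 /* latency-dominated   */
                              8 * 1024 * 1024 }; /* bandwidth-dominated */
    char *buf = malloc(sizes[1]);
    if (buf == NULL) MPI_Abort(MPI_COMM_WORLD, 1);

    for (int pass = 0; pass < 2; ++pass) {
        int bytes = (int)sizes[pass];

        MPI_Barrier(MPI_COMM_WORLD);
        double t0 = MPI_Wtime();
        for (int i = 0; i < reps; ++i) {
            if (rank == 0) {
                MPI_Send(buf, bytes, MPI_CHAR, 1, 0, MPI_COMM_WORLD);
                MPI_Recv(buf, bytes, MPI_CHAR, 1, 0, MPI_COMM_WORLD,
                         MPI_STATUS_IGNORE);
            } else if (rank == 1) {
                MPI_Recv(buf, bytes, MPI_CHAR, 0, 0, MPI_COMM_WORLD,
                         MPI_STATUS_IGNORE);
                MPI_Send(buf, bytes, MPI_CHAR, 0, 0, MPI_COMM_WORLD);
            }
        }
        double t1 = MPI_Wtime();

        if (rank == 0) {
            /* Half the round-trip time per repetition = one-way time. */
            double one_way = (t1 - t0) / (2.0 * reps);
            double gbits   = (bytes * 8.0) / one_way / 1e9;
            printf("%d bytes: one-way time %.2f us, ~%.1f Gbit/s\n",
                   bytes, one_way * 1e6, gbits);
        }
    }

    free(buf);
    MPI_Finalize();
    return 0;
}
</syntaxhighlight>

Compile with an MPI compiler wrapper (e.g. mpicc) and run with two ranks placed on different nodes via the batch system: the small message approximates the per-link latency, while the large message approximates the achievable point-to-point bandwidth.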


Filesystem