- Infos im HLRS Wiki sind nicht rechtsverbindlich und ohne Gewähr -
- Information contained in the HLRS Wiki is not legally binding and HLRS is not responsible for any damages that might result from its use -

HPE Hawk: Difference between revisions

From HLRS Platforms
Jump to navigationJump to search
No edit summary
 
(170 intermediate revisions by 9 users not shown)
Line 1: Line 1:
'''Hawk''' is the next generation HPC system at HLRS. It will replace the existing [[Cray XC40|HazelHen]] system.
The installation is planed to take place in Q4 2019. For more detailed information see the [[Hawk installation schedule]].


This Page is under construction!
{{Note
| text = Please be sure to read at least the [[10_minutes_before_the_first_job]] document and consult the [[General HWW Documentation]] before you start to work with any of our systems.
}}


{{Warning
| text = In prepartion of the next generation supercomputer [[ Hunter_(HPE) | Hunter ]], the hardware configuration has been reduced (from 5632 compute nodes to 4096 compute nodes). Workspace filesystem ws10 has been removed.
}}


== Access ==


Login-Node: hawk-tds-login2.hww.hlrs.de
----
<br>
Please have in mind that access to the Hawk TDS is limited to hpcoft-accounts at the moment.
<br>




== Batch System ==
{| style="border:0; margin: 0;" width="100%" cellspacing="10"


[[Batch_System_PBSPro_(Hawk)|Batch System PBSPro (Hawk)]]
| valign="top" style="padding: 0; border: 1px solid #aaaaaa; margin-bottom: 0;" |
<div style="font-size: 105%; padding: 0.4em; background-color: #eeeeee; border-bottom: 1px solid #aaaaaa; text-align: center;">'''Introduction'''</div>
<div style="background: #ffffff; padding:0.2em 0.4em;">
{| style="border: 0; margin: 0;" cellpadding="3"
| valign="top" | 
<!-- * [[Hawk_installation_schedule#Terms_of_Use | Terms of use ]] -->
* [[HPE_Hawk_access|Access]]
* [[HPE_Hawk_Hardware_and_Architecture|Hardware and Architecture]]
|}
</div>




== MPI ==


In order to use the MPI implementation provided by HPE, please load the Message Passing Toolkit (MPT) module ''mpt'' (not ABI-compatible to other MPI implementations) or ''hmpt'' (ABI-compatible to MPICH-derivatives).
| valign="top" style="padding: 0; border: 1px solid #aaaaaa; margin-bottom: 0;" |
For detailed information see the [http://www.hpe.com/support/mpi-ug-036 HPE Message Passing Interface (MPI) User Guide].
<div style="font-size: 105%; padding: 0.4em; background-color: #eeeeee; border-bottom: 1px solid #aaaaaa; text-align: center;">'''Troubleshooting'''</div>
<div style="background: #ffffff; padding:0.2em 0.4em;">
{| style="border: 0; margin: 0;" cellpadding="3"
| valign="top" | 
* [[HPE_Hawk_Support|Support (contact/staff)]]
* [[HPE_Hawk_FAQ|FAQ]]
* [http://websrv.hlrs.de/cgi-bin/hwwweather?task=viewmachine&machine=hawk Status,Maintenance for hawk]
* [[HPE_Hawk_News|News]]
|}
</div>


<br>
== Modulefile best practices ==


* Set an environment variable to the root path of your installation (cf. e.g. MPI_ROOT in /usr/share/Modules/modulefiles/hmpt/2.19).
|}
* Set not only CPATH but also respective variables used by PGI / Intel / etc. -> someone has to figure out the list of those variables, same w.r.t. (LD_)LiBRARY_PATH.
* Include your Name (finger does not work anymore), E-Mail and date of installation into the modulefile.
* It's possible to hold the modulefile(s) together with the actual installation in the respective directory and just create symlinks in /opt/hlrs/modulefiles/.
* Directory structure of /opt/hlrs/ shall be replicated in /opt/hlrs/unsupported-modulefiles/.
* In case of dependencies, load explicit versions instead of default one!


As an example a modulefile for package "foo" version 1.23 (within the category "performance") should look like:
#%Module1.0
BASE_DIR=/opt/hlrs/
CAT=performance
PACKAGE=Foo
VERSION=1.23
FOO_ROOT=$BASE_DIR/$CAT/$PACKAGE/$VERSION
setenv FOO_ROOT $FOO_ROOT
setenv FOO_VERSION $VERSION
prepend-path PATH                $FOO_ROOT/bin
prepend-path LD_LIBRARY_PATH    $FOO_ROOT/lib        # library search path at time of execution
prepend-path LIBRARY_PATH        $FOO_ROOT/lib        # equivalent of "-L" for C, C++, and Fortran; works for gcc and clang
prepend-path CPATH              $FOO_ROOT/include    # equivalent to "-I" for C and C++; works with gcc and clang
prepend-path CPLUS_INCLUDE_PATH  $FOO_ROOT/include    # equivalent to "-I" only for C++ compiler; usually not needed as CPATH will do
#prepend-path C_INCLUDE_PATH      $FOO_ROOT/include    # equivalent to "-I" only for C compiler; usually not needed as CPATH will do
###  not verified yet, seems to be intel specific: prepend-path FPATH              $FOO_ROOT/include    # same for Fortran


== Test cases best practices ==


Test cases will help to identify and determine the scaling/ performance behavior of the new system.
Ideally, those test cases can be compared to other systems as well to get a full picture.


To do:
* Definition of a best practice guideline on how to set up a correct test case
** Only measurement of time-stepping loop or equivalent excluding the initialization phase or cleanup
** Well defined measure of computational progress (e.g. LUPS, DoF-UPS, Iterations/s or Flop/s)
** Ideally, the test case is mostly automated with scripts and does also the evaluation on top with a meaningful result file


{| style="border:0; margin: 0;" width="100%" cellspacing="10"


== TODO ==
| valign="top" style="padding: 0; border: 1px solid #aaaaaa; margin-bottom: 0;" |
* 2019-08-15, niethammer@hlrs.de: need more explanation on how ''omplace'' works for pinning in the context of SMT (numbering of cores?)
<div style="font-size: 105%; padding: 0.4em; background-color: #eeeeee; border-bottom: 1px solid #aaaaaa; text-align: center;">'''Documentation'''</div>
* 2019-08-15, niethammer@hlrs.de: how to run correctly scripts/wrappers with mpirun? (executes script only once per node, but MPI application if called inside multiple times)
<div style="background: #ffffff; padding:0.2em 0.4em;">
{| style="border: 0; margin: 0;" cellpadding="3"
| valign="top" | 
* [[Batch_System_PBSPro_(Hawk)|Batch System]]
* [[Module environment(Hawk)|Module Environment]]
* [[Storage_(Hawk)| Storage Description ]]
* [[Compiler(Hawk)|Compiler]]
* [[MPI(Hawk)|MPI]]
* [[Libraries(Hawk)|Libraries]]
* [[Manuals(Hawk)|Manuals]]
* [[Optimization|Optimization]]
* [[Hawk_PrePostProcessing|Pre- and Post-Processing]]
* [[Big_Data,_AI_Aplications_and_Frameworks|Big Data, AI Applications and Frameworks]]
* [[Performance Analysis Tools]]
* [[CPE|Cray Programming Environment (CPE)]]


* 2019-08-15, niethammer@hlrs.de: missing commands:
|}
**''resize'' (likely coming with xterm)
</div>


* 2019-08-15, niethammer@hlrs.de:
** MPT = Message Passing Toolkit
** MPI from the mpt module uses the SGI ABI, MPI from the hmpt module uses the MPICH ABI
** for the MPI compiler wrappers to detect the correct compiler please set MPICC_CC, MPICXX_CXX, MPIF90_F90, MPIF08_F08 to the corresponding compiler commands (2019-08-15, dick@hlrs.de: done)
** Should e.g. applications using cae/platform_mpi use perfboost?


* 2019-08-07, dick@hlrs: hmpt is ABI-compatible with MPICH-derivativs, but not so mpt
** user should know about this!
** @HPE: is hmpt a MPICH-derivative, but not so mpt?


* 2019-08-07, dick@hlrs: unclear that (h)mpt provides MPI lib -> call it "mpi/hmpt" and "mpi/mpt" instead?
| valign="top" style="padding: 0; border: 1px solid #aaaaaa; margin-bottom: 0;" |
<div style="font-size: 105%; padding: 0.4em; background-color: #eeeeee; border-bottom: 1px solid #aaaaaa; text-align: center;">'''Utilities'''</div>
<div style="background: #ffffff; padding:0.2em 0.4em;">
{| style="border: 0; margin: 0;" cellpadding="3"
| valign="top"
* [[CAE_utilities|CAE Utilities]]
* [[CAE_howtos|CAE HOWTOs]]
* [[MKL | MKL Fortran Interfaces ]]
|}
</div>


* 2019-08-07, dick@hlrs: remove MPI delivered with RHEL
|}


* 2019-08-07, dick@hlrs: Intel loads gcc module -> use (LD_)LIBRARY_PATH intead


* 2019-08-07, dick@hlrs: be careful: cc points to /usr/bin/gcc!
----
 
[[Help | Help for Wiki Usage]]
* 2019-08-14, khabi/offenhaeuser@hlrs.de: How to pin OpenMP-threads in hybrid jobs (naive approach pins 2 threads to _same_ core instead of two different), i.e.: how to do aprun -d?
 
<br>

Latest revision as of 08:44, 25 October 2024

Note: Please be sure to read at least the 10_minutes_before_the_first_job document and consult the General HWW Documentation before you start to work with any of our systems.


Warning: In prepartion of the next generation supercomputer Hunter , the hardware configuration has been reduced (from 5632 compute nodes to 4096 compute nodes). Workspace filesystem ws10 has been removed.




Introduction


Troubleshooting




Documentation


Utilities



Help for Wiki Usage