- Information contained in the HLRS Wiki is not legally binding and HLRS is not responsible for any damages that might result from its use -

HPE Hawk

Hawk is the next-generation HPC system at HLRS. It will replace the existing Hazel Hen system. The installation is planned to take place in Q4 2019. For more detailed information see the [[Hawk installation schedule]].

This page is under construction!

== Best Practices for Software Installation ==
Best practices for software installation on Hawk are described on a separate wiki page: [[HPE Hawk software installation]].


== Topics for Software Installation Meeting ==
* Installation directory: /opt/hlrs; platform-independent software (exception): /sw/general; '''links between platforms via /sw/* must no longer be used'''
* Documentation of installations: for users in the platforms wiki; for installers in the staff wiki
* how to inform users about software updates in the future: no link from the platforms wiki into the staff wiki
MOTD?; keep the mailing list lean; it must be automated (see the module-changes e-mail); track who loads which modules when -> send only the relevant e-mails
* hands-on session with sit (software installation tool)
* template for modulefiles; see [[#Modulefile best practices]]
* hierarchy of modulefiles [[#Hierarchy of modules]]
* default compiler / MPI / others
* how to deal with software from the base Linux installation which should not be visible on the cluster, e.g. the system compiler (gcc 4.8.5)
* install location of python bindings


== Access ==
Login node: hawk-tds-login2.hww.hlrs.de

Note: Access to the Hawk TDS is limited to support staff at the moment. Please check the [[Hawk installation schedule]] for details about the start of user access.

== Batch System ==
See [[Batch System PBSPro (Hawk)]].

== MPI ==
In order to use the MPI implementation provided by HPE, please load the Message Passing Toolkit (MPT) module mpt (not ABI-compatible with other MPI implementations) or hmpt (ABI-compatible with MPICH derivatives). For detailed information see the HPE Message Passing Interface (MPI) User Guide.
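A minimal usage sketch (program name, compiler, and process count are illustrative; the MPICC_CC mechanism is also noted in the TODO section below):

module load mpt              # or: module load hmpt  for the MPICH ABI
export MPICC_CC=icc          # tell the mpicc wrapper which compiler to invoke
mpicc -o hello hello.c       # compile with the MPT compiler wrapper
mpirun -np 128 ./hello       # launch 128 MPI processes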
== Modulefile best practices ==
* Set an environment variable to the root path of your installation (cf. e.g. MPI_ROOT in /usr/share/Modules/modulefiles/hmpt/2.19).
* Set not only CPATH but also the respective variables used by PGI / Intel / etc. -> someone has to figure out the list of those variables; the same holds for (LD_)LIBRARY_PATH.
* Include your name (finger does not work anymore), e-mail address, and the date of installation in the modulefile.
* It's possible to keep the modulefile(s) together with the actual installation in the respective directory and just create symlinks in /opt/hlrs/modulefiles/ (see the sketch after the example below).
* Directory structure of /opt/hlrs/ shall be replicated in /opt/hlrs/unsupported-modulefiles/.
* In case of dependencies, load explicit versions instead of the default one!
As an example, a modulefile for package "foo" version 1.23 (within the category "performance") should look like this:
#%Module1.0
#
# Change log:
#  Updated   12 Jul 2019, Christoph Niethammer <niethammer@hlrs.de>
#  Installed 08 Aug 2018, Jose Gracia <gracia@hlrs.de>

conflict module_that_should_not_be_loaded_at_the_same_time
module load module_required_by_this_one

# Tcl variables used to assemble the installation path
set BASE_DIR /opt/hlrs
set CAT      performance
set PACKAGE  foo
set VERSION  1.23
set FOO_ROOT $BASE_DIR/$CAT/$PACKAGE/$VERSION

setenv FOO_ROOT    $FOO_ROOT
setenv FOO_VERSION $VERSION

prepend-path PATH               $FOO_ROOT/bin        ;# executables
prepend-path LD_LIBRARY_PATH    $FOO_ROOT/lib        ;# library search path at time of execution (i.e. in case of _dynamic_ linking)
prepend-path LIBRARY_PATH       $FOO_ROOT/lib        ;# equivalent of "-L" for C, C++, and Fortran
prepend-path CPATH              $FOO_ROOT/include    ;# equivalent of "-I" for C, C++, and Fortran
prepend-path CPLUS_INCLUDE_PATH $FOO_ROOT/include    ;# equivalent of "-I" for the C++ compiler only; usually not needed as CPATH will do
prepend-path C_INCLUDE_PATH     $FOO_ROOT/include    ;# equivalent of "-I" for the C compiler only; usually not needed as CPATH will do
prepend-path MANPATH            $FOO_ROOT/share/man  ;# man pages

module-whatis "_brief_ description of what is provided by this module"

proc ModulesHelp {} {
    puts stderr "First line of detailed description"
    puts stderr "Second line of detailed description"
    puts stderr "Third line of detailed description"
}
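
If the modulefile is kept next to the installation as suggested above, the symlink into /opt/hlrs/modulefiles/ can be created like this (a sketch; paths follow the "foo" example and are illustrative):

mkdir -p /opt/hlrs/modulefiles/performance/foo
ln -s /opt/hlrs/performance/foo/1.23/modulefile /opt/hlrs/modulefiles/performance/foo/1.23
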
== Hierarchy of modules ==
On Vulcan and Hazel Hen we have various module directories like ''tools'', ''utils'', ''misc''. However, it is not very clear what should go where; in the end, anything is a tool. I would therefore propose to be more specific.
Proposal for a module hierarchy (please extend):
development/    # development tools
    # svn, git, binutils, cmake
mpi/            # nobody will suspect MPI under mpt/
    # mpt, hmpt, ...
compiler/
    # gcc, oacc, intel, pgi, ...
numlib/         # numerical libraries
    # mkl, trilinos, ...
debugger/       # debugging tools
    # forge, ...
performance/    # performance analysis tools
    # vampir, extrae, scalasca, inspector, advisor, darshan, ...
visualization/  # data visualization tools
    # paraview, ...
python/
    # 3.X; do we need/want 2.7?
What to do with libraries which are not necessarily "numlib", e.g.
boost/ -> libraries/boost
hdf5/  -> libraries/hdf5? io/hdf5?
Libraries which are actually some kind of programming model:
gpi2    -> libraries/gpi2
tbb    -> libraries/tbb
What to do with software for projects? Such as
tools/hidalgo/fenics_hpc/3dairq
prace/prace
Put them in directories which are readable only for a certain group?
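
With such a hierarchy (and assuming /opt/hlrs/modulefiles is the directory registered in MODULEPATH), the category would simply become part of the module name; a sketch with illustrative module names:

module use /opt/hlrs/modulefiles   # makes the category directories visible
module avail performance           # lists e.g. performance/vampir, performance/scalasca
module load performance/vampir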


== Test cases best practices ==

Test cases will help to identify and assess the scaling and performance behavior of the new system. Ideally, those test cases can also be compared across other systems to get a full picture.

To do:

* Definition of a best-practice guideline on how to set up a correct test case:
** measure only the time-stepping loop (or equivalent), excluding the initialization and cleanup phases
** use a well-defined measure of computational progress (e.g. LUPS, DoF-UPS, iterations/s or Flop/s)
** ideally, the test case is mostly automated with scripts and also performs the evaluation, producing a meaningful result file (see the sketch below)
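
A sketch of such a driver script (solver name, input file, process count, and output format are assumptions for illustration):

#!/bin/bash
# Run the test case, extract the progress metric of the timed loop,
# and append it to a result file for later comparison across systems.
module load mpt
NP=128
mpirun -np $NP ./solver input.cfg > run.log
# assumes the solver prints "progress: <value> iterations/s" for the
# time-stepping loop only (initialization and cleanup excluded)
rate=$(awk '/progress:/ {print $2}' run.log)
echo "$(date +%F) np=${NP} rate=${rate} iterations/s" >> results.dat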


== TODO ==

* 2019-08-22, niethammer@hlrs.de: missing PBS headers (tm.h, ...)
* 2019-08-18, dick@hlrs.de: (exuberant) ctags missing on the frontend, probably available from the RHEL repository
* 2019-08-18, dick@hlrs.de: man pages are missing on the frontend
* 2019-08-15, niethammer@hlrs.de: need more explanation on how omplace works for pinning in the context of SMT (numbering of cores?)
* 2019-08-15, niethammer@hlrs.de: how to correctly run scripts/wrappers with mpirun? (the script is executed only once per node, but an MPI application called inside it is executed multiple times)
* 2019-08-15, niethammer@hlrs.de: missing commands:
** resize (likely coming with xterm)
* 2019-08-15, niethammer@hlrs.de:
** MPT = Message Passing Toolkit
** MPI from the mpt module uses the SGI ABI, MPI from the hmpt module uses the MPICH ABI
** for the MPI compiler wrappers to detect the correct compiler, please set MPICC_CC, MPICXX_CXX, MPIF90_F90, MPIF08_F08 to the corresponding compiler commands (2019-08-15, dick@hlrs.de: done)
** Should applications using e.g. cae/platform_mpi use perfboost?
* 2019-08-07, dick@hlrs: hmpt is ABI-compatible with MPICH derivatives, but mpt is not
** users should know about this!
** @HPE: is hmpt an MPICH derivative, but mpt not?
* 2019-08-07, dick@hlrs: it is unclear that (h)mpt provides an MPI library -> call the modules "mpi/hmpt" and "mpi/mpt" instead?
* 2019-08-07, dick@hlrs: remove the MPI delivered with RHEL
* 2019-08-07, dick@hlrs: Intel loads the gcc module -> use (LD_)LIBRARY_PATH instead
* 2019-08-07, dick@hlrs: be careful: cc points to /usr/bin/gcc!
* 2019-08-14, khabi/offenhaeuser@hlrs.de: How to pin OpenMP threads in hybrid jobs (the naive approach pins 2 threads to the _same_ core instead of two different ones), i.e.: how to do the equivalent of aprun -d?


== Training ==

There will be internal (i.e. HLRS staff only) training sessions on the following topics (tentative):

* HPE Performance MPI
** survey to set a date
** topics available via the above link
** target audience: user support staff, internal users of the system
* Processor
** survey to set a date
** (tentative) schedule available via the above link
** target audience: user support staff, internal users of the system
* Workload Management PBSPro for end users
** probably one day in the week of 2019-11-11 to 2019-11-15
** target audience: user support staff, internal users of the system
* Cluster and System Administration using HPCM
** target audience: admin staff
* InfiniBand Administration and Tuning
** target audience: admin staff
* Lustre and Storage Administration
** target audience: admin staff
* Workload Management PBSPro Administration and Tuning
** target audience: admin staff