|
|
(141 intermediate revisions by 6 users not shown) |
Line 1: |
Line 1: |
− | '''Hawk''' is the next generation HPC system at HLRS. It will replace the existing [[Cray XC40|HazelHen]] system.
| |
− | The installation is planned to take place in Q4 2019. For more detailed information see the [[Hawk installation schedule]].
| |
| | | |
− | This Page is under construction!
| + | {{Note |
| + | | text = Please be sure to read at least the [[10_minutes_before_the_first_job]] document and consult the [[General HWW Documentation]] before you start to work with any of our systems. |
| + | }} |
| | | |
| + | ---- |
| | | |
− | == Topics for Software Installation Meeting ==
| |
| | | |
− | * hands-on sit (software installation tool)
| + | {| style="border:0; margin: 0;" width="100%" cellspacing="10" |
− | * template for modulefiles; see [[#Modulefile best practices]]
| |
− | * hierarchy of modulefiles [[#Hierarchy of modules]]
| |
− | * default compiler / MPI / others
| |
− | * how to deal with software from the base Linux installation that should not be visible on the cluster, e.g. the system compiler (gcc 4.8.5)
| |
− | * how to inform users about software updates in the future
| |
| | | |
− | == Access == | + | | valign="top" style="padding: 0; border: 1px solid #aaaaaa; margin-bottom: 0;" | |
| + | <div style="font-size: 105%; padding: 0.4em; background-color: #eeeeee; border-bottom: 1px solid #aaaaaa; text-align: center;">'''Introduction'''</div> |
| + | <div style="background: #ffffff; padding:0.2em 0.4em;"> |
| + | {| style="border: 0; margin: 0;" cellpadding="3" |
| + | | valign="top" | |
| + | * [[Hawk_installation_schedule#Terms_of_Use | Terms of use ]] |
| + | * [[HPE_Hawk_access|Access]] |
| + | * [[HPE_Hawk_Hardware_and_Architecture|Hardware and Architecture]] |
| + | |} |
| + | </div> |
| | | |
− | Login-Node: hawk-tds-login2.hww.hlrs.de
| |
− | {{note|text=Access to the Hawk TDS is limited to support staff at the moment. Please check the [[Hawk installation schedule]] for details about the start of user access.}}
| |
| | | |
− | == Batch System ==
| |
| | | |
− | [[Batch_System_PBSPro_(Hawk)|Batch System PBSPro (Hawk)]] | + | | valign="top" style="padding: 0; border: 1px solid #aaaaaa; margin-bottom: 0;" | |
| + | <div style="font-size: 105%; padding: 0.4em; background-color: #eeeeee; border-bottom: 1px solid #aaaaaa; text-align: center;">'''Troubleshooting'''</div> |
| + | <div style="background: #ffffff; padding:0.2em 0.4em;"> |
| + | {| style="border: 0; margin: 0;" cellpadding="3" |
| + | | valign="top" | |
| + | * [[HPE_Hawk_Support|Support (contact/staff)]] |
| + | * [[HPE_Hawk_FAQ|FAQ]] |
| + | * [http://websrv.hlrs.de/cgi-bin/hwwweather?task=viewmachine&machine=hawk Status, Maintenance for Hawk] |
| + | * [[HPE_Hawk_News|News]] |
| + | |} |
| + | </div> |
| | | |
| | | |
− | == MPI ==
| + | |} |
| | | |
− | To use the MPI implementation provided by HPE, please load the Message Passing Toolkit (MPT) module ''mpt'' (not ABI-compatible with other MPI implementations) or ''hmpt'' (ABI-compatible with MPICH derivatives).
| |
− | For detailed information see the [http://www.hpe.com/support/mpi-ug-036 HPE Message Passing Interface (MPI) User Guide].
| |
| | | |
− | <br>
| |
− | == Modulefile best practices ==
| |
| | | |
− | * Set an environment variable to the root path of your installation (cf. e.g. MPI_ROOT in /usr/share/Modules/modulefiles/hmpt/2.19).
| |
− | * Set not only CPATH but also the corresponding variables used by PGI, Intel, etc. (someone has to compile the list of those variables); the same applies to (LD_)LIBRARY_PATH.
| |
− | * Include your name (finger does not work anymore), e-mail address, and date of installation in the modulefile.
| |
− | * It is possible to keep the modulefile(s) together with the actual installation in the respective directory and just create symlinks in /opt/hlrs/modulefiles/.
| |
− | * The directory structure of /opt/hlrs/ shall be replicated in /opt/hlrs/unsupported-modulefiles/.
| |
− | * In case of dependencies, load explicit versions instead of the default one!
| |
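The symlink approach from the list above could be sketched as follows. This is only an illustration under assumptions not stated on this page: the helper name `link_modulefile`, the file name `modulefile` inside the installation directory, and the `<category>/<package>/<version>` layout of the link are all hypothetical.

```python
import tempfile
from pathlib import Path


def link_modulefile(install_dir: Path, modulefiles_root: Path) -> Path:
    """Create a symlink in the modulefile tree that points to the
    modulefile kept next to the installation.

    Assumes install_dir looks like <base>/<category>/<package>/<version>
    and contains a file named "modulefile" (hypothetical convention).
    """
    category, package, version = install_dir.parts[-3:]
    link = modulefiles_root / category / package / version
    link.parent.mkdir(parents=True, exist_ok=True)
    if link.is_symlink():
        link.unlink()  # refresh a stale link from an earlier installation
    link.symlink_to(install_dir / "modulefile")
    return link


def demo() -> Path:
    """Run the helper inside a temporary directory instead of /opt/hlrs."""
    base = Path(tempfile.mkdtemp())
    install_dir = base / "performance" / "foo" / "1.23"
    install_dir.mkdir(parents=True)
    (install_dir / "modulefile").write_text("#%Module1.0\n")
    return link_modulefile(install_dir, base / "modulefiles")
```

Calling `demo()` leaves the real /opt/hlrs tree untouched; on the actual system the two arguments would be the installation directory and /opt/hlrs/modulefiles/.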
| | | |
− | As an example, a modulefile for package "foo" version 1.23 (within the category "performance") should look like:
| |
− |
| |
− | #%Module1.0
| |
− | #
| |
− | # Change log:
| |
− | # Updated 12 Jul 2019, Christoph Niethammer <niethammer@hlrs.de>
| |
− | # Installed 08 Aug 2018, Jose Gracia <gracia@hlrs.de>
| |
− |
| |
− | set BASE_DIR /opt/hlrs
| |
− | set CAT performance
| |
− | set PACKAGE Foo
| |
− | set VERSION 1.23
| |
− |
| |
− | set FOO_ROOT $BASE_DIR/$CAT/$PACKAGE/$VERSION
| |
− |
| |
− | setenv FOO_ROOT $FOO_ROOT
| |
− | setenv FOO_VERSION $VERSION
| |
− |
| |
− | prepend-path PATH $FOO_ROOT/bin
| |
− | prepend-path LD_LIBRARY_PATH $FOO_ROOT/lib ;# library search path at time of execution (i.e. in case of _dynamic_ linking)
| |
− | prepend-path LIBRARY_PATH $FOO_ROOT/lib ;# equivalent to "-L" for C, C++, and Fortran
| |
− | prepend-path CPATH $FOO_ROOT/include ;# equivalent to "-I" for C, C++, and Fortran
| |
− | prepend-path CPLUS_INCLUDE_PATH $FOO_ROOT/include ;# equivalent to "-I" only for C++ compiler; usually not needed as CPATH will do
| |
− | prepend-path C_INCLUDE_PATH $FOO_ROOT/include ;# equivalent to "-I" only for C compiler; usually not needed as CPATH will do
| |
− | prepend-path MANPATH $FOO_ROOT/share/man ;# manpages
| |
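Since a template for modulefiles is one of the meeting topics, a modulefile like the example above could be generated from a few parameters. The following is a minimal sketch, not an agreed HLRS tool: the function `render_modulefile`, its parameters, and the derived `<PACKAGE>_ROOT` naming are assumptions for illustration, and the maintainer shown in the usage note is a placeholder.

```python
# Tcl modulefile template; {placeholders} are filled in by str.format,
# while $NAME stays literal and is expanded later by the module command.
MODULEFILE_TEMPLATE = """\
#%Module1.0
#
# Change log:
#   Installed {date}, {maintainer}

set BASE_DIR {base_dir}
set CAT      {category}
set PACKAGE  {package}
set VERSION  {version}
set ROOT     $BASE_DIR/$CAT/$PACKAGE/$VERSION

setenv {env_prefix}_ROOT    $ROOT
setenv {env_prefix}_VERSION $VERSION

prepend-path PATH            $ROOT/bin
prepend-path LD_LIBRARY_PATH $ROOT/lib
prepend-path LIBRARY_PATH    $ROOT/lib
prepend-path CPATH           $ROOT/include
prepend-path MANPATH         $ROOT/share/man
"""


def render_modulefile(category, package, version, maintainer, date,
                      base_dir="/opt/hlrs"):
    """Return the modulefile text for one package installation."""
    # Environment variable prefix, e.g. "foo" -> "FOO" (assumed convention).
    env_prefix = package.upper().replace("-", "_")
    return MODULEFILE_TEMPLATE.format(
        base_dir=base_dir, category=category, package=package,
        version=version, maintainer=maintainer, date=date,
        env_prefix=env_prefix,
    )
```

For instance, `render_modulefile("performance", "foo", "1.23", "Jane Doe <jane@example.org>", "01 Jan 2020")` returns text that sets FOO_ROOT and FOO_VERSION and records the installer and date, as requested in the best practices.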
| | | |
− | == Hierarchy of modules == | + | {| style="border:0; margin: 0;" width="100%" cellspacing="10" |
− | On Vulcan and Hazel Hen we have various module directories like ''tools'', ''utils'', ''misc''. However, it is not very clear what should go where; in the end, everything is a tool. I would therefore propose to be more specific.
| |
| | | |
− | Proposal for module hierarchy (please extend)
| + | | valign="top" style="padding: 0; border: 1px solid #aaaaaa; margin-bottom: 0;" | |
| + | <div style="font-size: 105%; padding: 0.4em; background-color: #eeeeee; border-bottom: 1px solid #aaaaaa; text-align: center;">'''Documentation'''</div> |
| + | <div style="background: #ffffff; padding:0.2em 0.4em;"> |
| + | {| style="border: 0; margin: 0;" cellpadding="3" |
| + | | valign="top" | |
| + | * [[Batch_System_PBSPro_(Hawk)|Batch System]] |
| + | * [[Module environment(Hawk)|Module Environment]] |
| + | * [[Storage_(Hawk)| Storage Description ]] |
| + | * [[Compiler(Hawk)|Compiler]] |
| + | * [[MPI(Hawk)|MPI]] |
| + | * [[Libraries(Hawk)|Libraries]] |
| + | * [[Manuals(Hawk)|Manuals]] |
| + | * [[Optimization|Optimization]] |
| + | |} |
| + | </div> |
| | | |
− | development/ # development tools
| |
− | # svn, git, binutils, cmake
| |
− |
| |
− | mpi/ # nobody will suspect MPI under mpt/
| |
− | # mpt, hmpt, ....
| |
− |
| |
− | compiler/
| |
− | # gcc, oacc, intel, pgi, ...
| |
− |
| |
− | numlib/ # numerical libraries
| |
− | # mkl, trilinos, ...
| |
− |
| |
− | debugger/ # debugging tools
| |
− | # forge, ...
| |
− |
| |
− | performance/ # performance analysis tools
| |
− | # vampir, extrae, scalasca, inspector, advisor, darshan, ...
| |
− |
| |
− | visualization/ # data visualization tools
| |
− | # paraview, ...
| |
− |
| |
− | python/
| |
− | # 3.X; do we need/want 2.7?
| |
| | | |
− | What to do with libraries which are not necessarily "numlib", e.g.
| |
− | boost/ -> libraries/boost
| |
− | hdf5/ -> libraries/hdf5
| |
| | | |
− | Libraries which are actually some kind of programming model:
| + | | valign="top" style="padding: 0; border: 1px solid #aaaaaa; margin-bottom: 0;" | |
− | gpi2 -> libraries/gpi2
| + | <div style="font-size: 105%; padding: 0.4em; background-color: #eeeeee; border-bottom: 1px solid #aaaaaa; text-align: center;">'''Utilities'''</div> |
− | tbb -> libraries/tbb | + | <div style="background: #ffffff; padding:0.2em 0.4em;"> |
| + | {| style="border: 0; margin: 0;" cellpadding="3" |
| + | | valign="top" | |
| + | * [[CAE_utilities|CAE Utilities]] |
| + | * [[MKL | MKL Fortran Interfaces ]] |
| + | |} |
| + | </div> |
| | | |
− | What to do with software for projects? Such as
| + | |} |
− | tools/hidalgo/fenics_hpc/3dairq
| |
− | prace/prace
| |
− | Put them in directories which are readable only for a certain group?
| |
| | | |
− | == Test cases best practices ==
| |
| | | |
− | Test cases will help to identify and determine the scaling/performance behavior of the new system.
| + | ---- |
− | Ideally, those test cases can be compared to other systems as well to get a full picture.
| + | [[Help | Help for Wiki Usage]] |
− | | |
− | To do:
| |
− | * Definition of a best practice guideline on how to set up a correct test case
| |
− | ** Only measurement of time-stepping loop or equivalent excluding the initialization phase or cleanup
| |
− | ** Well defined measure of computational progress (e.g. LUPS, DoF-UPS, Iterations/s or Flop/s)
| |
− | ** Ideally, the test case is mostly automated with scripts and also performs the evaluation, producing a meaningful result file
| |
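The guideline points above could look roughly like this in practice. It is a toy sketch only: the "solver" is a stand-in loop, and the names `run_benchmark` and `write_result` as well as the JSON result format are assumptions, not an agreed HLRS convention.

```python
import json
import time


def run_benchmark(n_cells, n_steps):
    """Time only the time-stepping loop and report lattice updates per second."""
    # Initialization phase: deliberately outside the timed region.
    field = [float(i) for i in range(n_cells)]

    start = time.perf_counter()
    for _ in range(n_steps):
        # Stand-in for a real solver step: exactly one update per cell.
        field = [0.5 * (field[i - 1] + field[i]) for i in range(n_cells)]
    elapsed = time.perf_counter() - start
    # Cleanup or output of the field would go here, again untimed.

    updates = n_cells * n_steps
    return {
        "cells": n_cells,
        "steps": n_steps,
        "updates": updates,
        "seconds": elapsed,
        "lups": updates / elapsed,  # well-defined measure of progress
    }


def write_result(result, path):
    """Store the evaluation in a machine-readable result file."""
    with open(path, "w") as handle:
        json.dump(result, handle, indent=2)
```

Reporting a rate such as LUPS rather than raw wall time is what makes runs with different problem sizes, or on different systems, comparable.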
− | | |
− | | |
− | == TODO ==
| |
− | | |
− | * 2019-08-22, niethammer@hlrs.de: missing pbs headers (tm.h, ...)
| |
− | * 2019-08-18, dick@hlrs.de: (exuberant) ctags missing on frontend, probably available from RHEL repository
| |
− | * 2019-08-18, dick@hlrs.de: manpages are missing on the frontend
| |
− | * 2019-08-15, niethammer@hlrs.de: need more explanation on how ''omplace'' works for pinning in the context of SMT (numbering of cores?)
| |
− | * 2019-08-15, niethammer@hlrs.de: how to correctly run scripts/wrappers with mpirun? (mpirun executes the script only once per node, but an MPI application called inside it is executed multiple times)
| |
− | | |
− | * 2019-08-15, niethammer@hlrs.de: missing commands:
| |
− | **''resize'' (likely coming with xterm)
| |
− | | |
− | * 2019-08-15, niethammer@hlrs.de:
| |
− | ** MPT = Message Passing Toolkit
| |
− | ** MPI from the mpt module uses the SGI ABI, MPI from the hmpt module uses the MPICH ABI
| |
− | ** for the MPI compiler wrappers to detect the correct compiler please set MPICC_CC, MPICXX_CXX, MPIF90_F90, MPIF08_F08 to the corresponding compiler commands (2019-08-15, dick@hlrs.de: done)
| |
− | ** Should e.g. applications using cae/platform_mpi use perfboost?
| |
− | | |
− | * 2019-08-07, dick@hlrs: hmpt is ABI-compatible with MPICH derivatives, but mpt is not
| |
− | ** user should know about this!
| |
− | ** @HPE: is hmpt an MPICH derivative, while mpt is not?
| |
− | | |
− | * 2019-08-07, dick@hlrs: unclear that (h)mpt provides MPI lib -> call it "mpi/hmpt" and "mpi/mpt" instead?
| |
− | | |
− | * 2019-08-07, dick@hlrs: remove MPI delivered with RHEL
| |
− | | |
− | * 2019-08-07, dick@hlrs: Intel loads gcc module -> use (LD_)LIBRARY_PATH instead
| |
− | | |
− | * 2019-08-07, dick@hlrs: be careful: cc points to /usr/bin/gcc!
| |
− | | |
− | * 2019-08-14, khabi/offenhaeuser@hlrs.de: How to pin OpenMP-threads in hybrid jobs (naive approach pins 2 threads to _same_ core instead of two different), i.e.: how to do aprun -d?
| |
− | | |
− | <br>
| |
− | | |
− | == Training ==
| |
− | | |
− | There will be internal (i.e. HLRS staff only) trainings on the following topics (tentative):
| |
− | | |
− | * HPE Performance MPI
| |
− | ** [https://terminplaner4.dfn.de/qTagd9VodYiy89mj survey to set date]
| |
− | ** topics available via the above link
| |
− | ** target audience: user support staff, internal users of the system
| |
− | | |
− | * Processor
| |
− | ** [https://terminplaner4.dfn.de/COB6iw5DAFFyDtwe survey to set date]
| |
− | ** (tentative) schedule available via the above link
| |
− | ** target audience: user support staff, internal users of the system
| |
− | | |
− | * Workload Management PBSpro for end users
| |
− | ** survey coming soon
| |
− | ** target audience: user support staff, internal users of the system
| |
− | | |
− | * Cluster and System Administration using HPCM
| |
− | ** target audience: admin staff
| |
− | | |
− | * Infiniband-Administration and Tuning
| |
− | ** target audience: admin staff
| |
− | | |
− | * Lustre and Storage Administration
| |
− | ** target audience: admin staff
| |
− | | |
− | * Workload Management PBSpro Administration and Tuning
| |
− | ** target audience: admin staff
| |