- Information contained in the HLRS Wiki is not legally binding and HLRS is not responsible for any damages that might result from its use -
10 minutes before the first job
Revision as of 15:59, 15 March 2018
We ask all users of any server operated by HLRS:
Please take 10 minutes to read this article completely!
On this page we describe the basic rules as briefly as possible. If you want to
know more about a topic, follow the corresponding link. But again, please read at
least this page!
Storage (see Storage_usage_policy)
There is no backup on any filesystem. Please copy important data into the archive.
HOME: Do not run any computational (I/O-intensive) job within the HOME directory. For compute jobs use a workspace!
Workspace: High performance storage is an expensive resource. It is intended for active projects only. Move suspended projects into the archive. Each workspace has a lifetime; once this lifetime is exceeded, all data in the workspace is deleted automatically. It is possible to receive an email reminder. Copy important data into the archive! More information: Workspace_mechanism
Archive: Do not store small files in the archive. Please check HPSS_User_Access for more information.
Data transfer to and from the workspace can be done as described in Data_Transfer_with_GridFTP. Using scp via the frontend nodes will fail due to the CPU limits there.
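The archive rule above (no small files) is often handled by packing a directory of small files into one tar file before archiving it. The following is only a minimal sketch of that idea in Python: the function name `bundle_directory` and the paths in the usage note are hypothetical, and bundling with tar is an assumption made here, not a procedure prescribed by HLRS. Please check HPSS_User_Access for the actual recommendations.

```python
import tarfile
from pathlib import Path


def bundle_directory(source_dir, archive_path):
    """Pack all regular files below source_dir into one gzip-compressed tar file.

    Returns the member names (paths relative to source_dir) that were stored.
    """
    source = Path(source_dir)
    target = Path(archive_path).resolve()

    # Collect the files before the tar file is created, and skip the tar file
    # itself in case it lives inside source_dir.
    files = sorted(
        p for p in source.rglob("*")
        if p.is_file() and p.resolve() != target
    )

    names = []
    with tarfile.open(archive_path, "w:gz") as tar:
        for path in files:
            arcname = path.relative_to(source).as_posix()
            tar.add(path, arcname=arcname)
            names.append(arcname)
    return names
```

For example, `bundle_directory("results", "results.tar.gz")` would produce a single file that can then be copied into the archive instead of many small ones.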
compute server
The frontend/login nodes are behind a firewall. Access and file transfer are only possible from registered IP addresses and only via the ssh protocol. If your IP address changes regularly or you have to work from different locations, consider using VPN.
The frontend nodes have a CPU time limit of 2h configured. Do not run compute-intensive jobs on frontend nodes. Frontend nodes are intended for access, batch submission, file transfer, workflow and development work.
The compute resources (compute nodes) for parallel compute jobs are only available through the batch system. Please read the batch system documents for the corresponding platform.
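To see which CPU time limit applies to your own session on a frontend node, the limit can be queried from within a process. The sketch below uses Python's standard `resource` module. It assumes that the 2h limit mentioned above is enforced as a per-process CPU time limit (the one `ulimit -t` reports); this page does not state how the limit is implemented, so treat the result as a hint only. The helper name `describe_cpu_limit` is hypothetical.

```python
import resource


def describe_cpu_limit(limit_seconds):
    """Turn an RLIMIT_CPU value (in seconds) into a short readable string."""
    if limit_seconds == resource.RLIM_INFINITY:
        return "unlimited"
    hours, rest = divmod(limit_seconds, 3600)
    minutes, seconds = divmod(rest, 60)
    return f"{hours}h {minutes:02d}m {seconds:02d}s"


if __name__ == "__main__":
    # getrlimit returns the (soft, hard) limit pair for the current process.
    soft, hard = resource.getrlimit(resource.RLIMIT_CPU)
    print("soft CPU time limit:", describe_cpu_limit(soft))
    print("hard CPU time limit:", describe_cpu_limit(hard))
```

A process that exceeds its CPU time limit is terminated by the system, which is why long-running work belongs in a batch job rather than on a frontend node.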
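Cray Hazel Hen: this system is NOT a cluster. Here we describe two topics which
have caused trouble multiple times:
- to start parallel tasks, using aprun is required (NOT mpirun!). If you start a parallel job using the wrong mechanism, this may cause trouble for all users. Please consult CRAY_XC40_Using_the_Batch_System
- a task using a large amount of memory should be started on a compute node; use aprun to do so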
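The aprun rule above can be illustrated with a small helper that assembles the launch command for use inside a batch job. This is a sketch only: `build_aprun_command` is a hypothetical name, and `./my_app`, `input.dat` and the rank counts in the usage note are made-up values. It uses the aprun options `-n` (total number of processing elements) and `-N` (processing elements per node); CRAY_XC40_Using_the_Batch_System remains the authoritative reference for the options to use on Hazel Hen.

```python
import shlex


def build_aprun_command(total_ranks, ranks_per_node, program, program_args=()):
    """Return the argument list for launching `program` through aprun."""
    if total_ranks < 1 or ranks_per_node < 1:
        raise ValueError("rank counts must be positive")

    # Guard against wrapping another launcher by mistake.
    if program.rsplit("/", 1)[-1] in {"mpirun", "mpiexec"}:
        raise ValueError("start parallel tasks with aprun, not mpirun")

    return [
        "aprun",
        "-n", str(total_ranks),      # total number of processing elements
        "-N", str(ranks_per_node),   # processing elements per compute node
        program,
        *[str(arg) for arg in program_args],
    ]


if __name__ == "__main__":
    command = build_aprun_command(48, 24, "./my_app", ["input.dat"])
    # Print the command line instead of running it; aprun only exists on the Cray.
    print(shlex.join(command))
```

Inside a job script the printed line would be executed directly. The same pattern applies to a single memory-hungry task: launching it through aprun places it on a compute node instead of on the node that runs the job script.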
documentation
Online documentation adapted to the HLRS/HWW site is available for each compute platform. There you can find information on how to get access, how to use the compute resources, how to adapt and develop your application, and how to start batch jobs, together with many examples, tips and features specific to the HLRS/HWW site. Get an overview of these documents before you start working on a specific compute platform.