10 minutes before the first job: Difference between revisions

Latest revision as of 11:37, 10 March 2021

We ask all users of any server operated by HLRS:

Please take 10 minutes to read this article completely!

This page describes the basic rules as briefly as possible. If you want to know more about a topic, follow the respective link. But again, please read at least this page!

Storage Storage_usage_policy

There is no backup of any filesystem. Please copy important data into the archive.

HOME: Do not run any computational (I/O-intensive) job within the HOME directory. For compute jobs, use a workspace!

Workspace: High-performance storage is an expensive resource. It is intended for active projects only. Move suspended projects into the archive. Each workspace has a lifetime; once this lifetime is exceeded, all data will be deleted automatically. It is possible to receive an email reminder. Copy important data into the archive! More information ==> Workspace_mechanism

Archive: Do not store small files in the archive. Please check HPSS_User_Access for more information.
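One common way to avoid many small files is to bundle them into a single tar archive first. The following is only a minimal sketch under stated assumptions: the directory name `results` and the file names are hypothetical, a temporary directory stands in for a workspace, and the actual transfer into the archive is not shown (see HPSS_User_Access for that).

```shell
#!/bin/sh
# Sketch: pack many small files into one tarball before archiving.
# All paths are hypothetical; a temporary directory stands in for a workspace.
workdir=$(mktemp -d)
mkdir "$workdir/results"

# Create a few small sample files standing in for real job output.
for i in 1 2 3; do
    echo "sample $i" > "$workdir/results/out_$i.txt"
done

# Produce one compressed archive instead of many small files.
bundle="$workdir/results.tar.gz"
tar -czf "$bundle" -C "$workdir" results

# List the contents to check the bundle before copying it to the archive.
tar -tzf "$bundle"
```

Listing the tarball before transferring it is a cheap check that nothing was missed, since the workspace copy will eventually be deleted.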

Data transfer to / from the workspace can be done using Data_Transfer_with_GridFTP. Using scp via frontend nodes will fail due to CPU limits.

compute server

The frontend/login nodes are behind a firewall. Access and file transfer are only possible from registered IP addresses and only via the ssh protocol. If your IP address changes regularly or you have to work from different locations, consider using VPN.

The frontend nodes have a CPU time limit of 2h configured. Do not run compute-intensive jobs on frontend nodes. Frontend nodes are intended for access, batch submission, file transfer, workflow and development work. If such work requires substantial resources (compiling, creating tarballs of dozens of MBs or more, copying a large amount of data, etc.), please prefix the respective command with "nice -+19" in order to keep the login node responsive for other users!
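As an illustration of the advice above, a resource-hungry command can be started at the lowest scheduling priority. This is a sketch only: it uses the POSIX spelling `nice -n 19`, which should behave like `nice -+19` on systems with GNU coreutils, and the directory and file names are hypothetical.

```shell
#!/bin/sh
# Sketch: run a heavy command at low priority on a login node.
# Paths are hypothetical; a temporary directory stands in for project data.
src=$(mktemp -d)
echo "example data" > "$src/data.txt"

tarball="$src.tar.gz"

# nice -n 19 raises the niceness by 19, i.e. lowest CPU priority for this command.
nice -n 19 tar -czf "$tarball" -C "$src" .

# Running nice without a command prints the current niceness,
# so this shows the value a prefixed command runs with.
prio=$(nice -n 19 nice)
echo "niceness of prefixed command: $prio"
```

The same prefix works for compilers, copy commands and similar tools; it lowers the priority only and does not lift the CPU time limit on the frontend nodes.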

The compute resources (compute nodes) for parallel compute jobs are only available through the batch system. Please read the batch system documents for the corresponding platform.

documentation

Online documentation for each compute platform is available, adapted to the HLRS/HWW site. There you can find information about how to get access, how to use compute resources, how to adapt and develop your application, and how to start batch jobs, with many examples, tips and site-specific features. Please get an overview of these documents before you start working on a specific compute platform.

@@ Line 16: / Line 16: @@
 '''HOME''': Do not run any computational (IO - intensive) job within the HOME directory. For compute jobs use the work space!
-'''Workspace''' [[Workspace_mechanism]] is an expensive ressource. It is intended for
+'''Workspace''': High performance storage is an expensive ressource. It is intended for
 active projects only. Move suspended projects into the archive. Each workspace
 has a lifetime, if this liftime is exceeded, all data will be deleted (automatic).
 It is possible to receive an email reminder. Copy important data into the archive!
+More information ==> [[Workspace_mechanism]]
-'''Archive''': do not store small file in the archive. Please check [[HPSS_User_Access]] for more information.
+'''Archive''': do not store small files in the archive. Please check [[HPSS_User_Access]] for more information.
 Data transfer to / from the workspace could be done using [[Data_Transfer_with_GridFTP]] . Using scp via frontend nodes will fail due to
@@ Line 32: / Line 33: @@
 The frontend nodes have a cpu timelimit of 2h configured. Do not run compute
-intensive jobs on frontend nodes. Frontend nodes are intented for access, batch submission, filetransfer, workflow and development work.
+intensive jobs on frontend nodes. Frontend nodes are intented for access, batch submission, filetransfer, workflow and development work. If doing so might require an increased number of resources (compiling, creating tarballs of dozens of MBs or more, copying a large amount of data, etc.), please prefix "nice -+19" to the respective command in order to keep the login node responsive for other users!
 The compute resources (compute nodes) for parallel compute jobs are only available through the batch system. Please read the batch system documents for the corresponding platform.
-'''Cray Hazel Hen''': this system is NOT a cluster. Here we decribe two topics which
-caused trouble multiple times:
-<UL>
-* to start parallel tasks, using ''aprun'' is required (NOT mpirun!!!)
-    if you start a parallel job using a wrong mechanism, this may cause
-    trouble for all users
-* a task using large amount of memory shold be started on a compute node,
-    use ''aprun'' to do so
-</UL>
 == documentation ==
 Online documentations for each compute platform are available adjusted for HLRS/HWW site. There you can find information about how to access, how to use compute resources, how to adapt and develop your application, how to start batch jobs with many examples, tips and specific features at HLRS/HWW site. You need to get an overview about the documents before you start working on a specific compute platform.
